ABSTRACT


This report describes the progress of an investigation concerning
the verification of COBOL programs. The report contains discussions
of program verification, the COBOL language, and the role of structured
programming in COBOL verification. The report also contains a
presentation of a COBOL subset suitable for an experimental verification
system, covering its syntax and semantics. The report also contains a
discussion of the assertion language and rules of inference to be used
in a COBOL verification system.


FOREWORD 

This document was prepared under the authority of U. S. Army
Research Office Contract No. DAHC 04-75-C-0011 in accordance with
Part 11, Article 4 of the contract, and was prepared by Stanford
Research Institute for the U. S. Army Computer Systems Command.

This report describes some preliminary results in an investigation
concerning the verification of COBOL programs.

1. Introduction

This report describes the progress of a project intended to study
the issues involved in the verification of COBOL programs, and to produce
some simple examples of verified programs in a selected subset of COBOL.

Given that program verification is useful in improving the reliability
of programs, and that it is of great importance that COBOL programs
be reliable (the vast majority of all programming is done in
COBOL), it is certainly worthwhile to examine the feasibility of applying
verification techniques to COBOL programs. One question is, "Why hasn't
it been done sooner?" The answer lies in two factors:

(1) COBOL is a "real" language (i.e., designed for and used by
a large community of users). Verification has only recently
been applied to real languages, because of the relative newness
of verification and because of the great complexity of
real languages.

(2) Verification has, up to now, been practiced mainly by
academicians, and academicians have a known distaste for
COBOL.

This project is mainly a feasibility study, with some research and
proof of concept. Once the major issues in COBOL verification are
determined, we intend to illustrate what it means to verify a COBOL program
(on a very small scale). The research involved is intended to extend
current verification techniques to make them applicable to COBOL. This
report is devoted mainly to a discussion of issues, and to a description
of the techniques that we are developing.

The  body  of  the  report  contains  general  motivational  material 
describing  the  theory,  observations,  and  general  approach  of  the  project. 
The three appendices contain descriptions of the particular results of
the project so far: the syntax and semantics of the COBOL subset for
verification.

2.  Program  Verification  - Theory 

The idea of program verification goes back as far as programming
itself: it was first discussed by von Neumann and Goldstine (1). The
basic idea is that there is a state that models some external phenomenon
(e.g., differential equations, matrices, payroll records). The state
can be represented by core memory, the contents of files, or program
variables (at a more abstract level). There is also a set of elementary
operations that change the state. Examples of elementary operations
are machine instructions or statements in higher-level programming
languages. A program defines a (possibly infinite) set of sequences
of elementary operations. When a program is executed, only one
sequence of elementary operations is performed. The selection of one
sequence out of the set of sequences defined by the program is
determined by the state just before the program is executed (i.e., the
initial state). Thus a program is a function from states to sequences
of operations. If the program terminates, the state just after
termination is called the final state.
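
The state model described above can be sketched in a few lines of Python.
This is only an illustration under assumed names (State, Operation,
program, and run are hypothetical and do not come from the report): a
state maps variable names to values, an elementary operation maps a state
to a new state, and a program selects one sequence of operations from the
initial state.

```python
from typing import Callable, Dict, List

State = Dict[str, int]                # assumed: variables hold integers
Operation = Callable[[State], State]  # an elementary operation changes the state


def add_n_to_total(state: State) -> State:
    """Elementary operation: total := total + n."""
    return {**state, "total": state["total"] + state["n"]}


def decrement_n(state: State) -> State:
    """Elementary operation: n := n - 1."""
    return {**state, "n": state["n"] - 1}


def program(initial: State) -> List[Operation]:
    """Select the one operation sequence determined by the initial state."""
    return [add_n_to_total, decrement_n] * initial["n"]


def run(initial: State) -> State:
    """Perform the selected sequence; the result is the final state."""
    state = dict(initial)
    for operation in program(initial):
        state = operation(state)
    return state
```

Here the length of the selected sequence depends only on the initial
value of n, which is the sense in which a program is a function from
states to sequences of operations.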

The user of a program is interested in knowing, for a given
initial state of the program, what the final state will be. Often he
will have a specification, which is a mapping from initial states to
final states. It is not immediately obvious whether a program (a mapping
from states to sequences of operations) and a specification (a mapping
from states to states) are consistent. Consistency between a
specification and a program is often called program correctness. Program
verification is a set of techniques for proving this consistency. Floyd (2)
first described this method of verification. The specification consists
of a statement of the properties that the initial state must have (the
input assertion), and a statement of the relation between the initial
state and the final state (the output assertion). Both input and output
assertions are stated as predicates.

The effects of each of the elementary operations on the state must
also be formally described (input and output assertions for these
operations are useful as well). The control operations, which do not in
themselves affect the state, must also be axiomatized. Since a program
may, in a small number of statements, describe a large (possibly
infinite) sequence of operations, inductive assertions must be associated
with each of the loops of the program.
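
As a small illustration of the three kinds of assertion, the following
Python sketch annotates a summation loop. The function name sum_to and the
particular predicates are assumptions made for this example. Note that
assert statements only check the predicates at run time for one initial
state; verification proper would prove them for all states satisfying the
input assertion.

```python
def sum_to(n: int) -> int:
    """Sum the integers 1..n, annotated in the style of Floyd's method."""
    # Input assertion: a property the initial state must have.
    assert n >= 0

    i, total = 0, 0
    while i < n:
        # Inductive assertion, associated with the loop: it holds every
        # time control reaches this point.
        assert 0 <= i <= n and total == i * (i + 1) // 2
        i += 1
        total += i

    # Output assertion: relates the final state to the initial state.
    assert total == n * (n + 1) // 2
    return total
```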

Floyd's method is used for proving partial correctness of programs.
A partially correct program is guaranteed to be consistent with its
assertions only if it terminates. Termination of a program can be proved
separately.

Given input and output assertions, program text (with inductive assertions),
and the definition of the elementary operations, a formula in first order
logic can be constructed whose validity is equivalent to the partial
correctness of the program. This formula is called a verification condition.
A software system that accepts as input the program to be verified (with
input, output, and inductive assertions) and produces its verification
conditions is called a verification condition generator (3,4).
Verification conditions can be proved by hand, or can
serve as input to a deductive system, or automatic theorem prover, which
attempts to generate a proof. Most deductive systems are inadequate for
proving verification conditions by completely automatic means, and many
systems are equipped with interactive facilities to allow users to guide the
proof. Deductive systems with interactive facilities are also called
semi-automatic verification systems.
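
To make the notion concrete, consider a small worked example that is not
taken from the report: the program
i := 0; total := 0; while i < n do (i := i + 1; total := total + i),
with assumed input assertion n >= 0, assumed inductive assertion
total = i(i+1)/2 and 0 <= i <= n at the loop, and assumed output
assertion total = n(n+1)/2. One verification condition arises for each
path between assertions:

```latex
\begin{align*}
\text{(entry)}\quad
  & n \ge 0 \;\Rightarrow\; 0 = \tfrac{0\,(0+1)}{2} \,\wedge\, 0 \le 0 \le n \\
\text{(loop)}\quad
  & total = \tfrac{i(i+1)}{2} \,\wedge\, 0 \le i \le n \,\wedge\, i < n \\
  & \quad\Rightarrow\; total + (i+1) = \tfrac{(i+1)(i+2)}{2} \,\wedge\, 0 \le i+1 \le n \\
\text{(exit)}\quad
  & total = \tfrac{i(i+1)}{2} \,\wedge\, 0 \le i \le n \,\wedge\, \neg(i < n)
    \;\Rightarrow\; total = \tfrac{n(n+1)}{2}
\end{align*}
```

Each formula is obtained by substituting the effect of the assignments on
the path into the assertion at the end of the path. If all three are
valid, the program is partially correct with respect to these assertions.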

The application of formal techniques to a particular programming language
environment is often a matter of style. The verification condition generator
incorporates most of the language-dependent features, because it must
translate statements in the programming language into expressions in predicate
calculus. Some verification condition generators are based on a particular
semantic description of a language. A verification condition generator for
PASCAL (London, Luckham, and Igarashi, 4) is based on the axiomatic
description of PASCAL by Hoare and Wirth (5).

A verification condition generator axiomatizes the control structures
of the language, but properties of the data types of a language are often
too complex to be incorporated into the verification conditions themselves.
Verification conditions, especially in a high-level language, typically
contain references to functions that axiomatize the data types of the language.
The deductive system can prove formulae containing these functions either by
invoking their definitions (if supplied) or by applying axioms (or high-level
rules of inference) to make deductions. The first method works well for
primitive recursive functions (Boyer and Moore, 6) but is extremely
inefficient for more complex domains. Most verification systems, including the SRI
system (Elspas et al., 3; Waldinger and Levitt, 7), use the second method.
However, with this method no proof can be trusted if the axioms are wrong.
One approach to this problem is to use high-level rules of inference to find
a proof, and to check its validity using definitions and a proof checker
(Boyer et al., 8). The proof checker would be used to substantiate the
validity of any instantiation of an axiom that is actually used in a proof.
This may be easier than proving the most general form of the axiom from the
definitions.

There are several areas that have not been addressed by the mainstream
of program verification. The first is termination. This issue has been
addressed by several researchers (3,9,10), and can be treated either together
with or separately from the issue of partial correctness. Two other issues,
run-time errors and validity of input data, are also important to formalize
if verification is to lead to software reliability. All three of these
issues have been grouped, to some extent, into a property called clean
termination (Sites, 11). Although these issues are important, they will not
be considered during this contract, which must limit itself to the basic
issues of partial correctness for COBOL programs.

3. Program Verification for Real Languages

COBOL is a member of the set of "real" programming languages, i.e.,
those that are widely used in many applications and for which standards
exist. Real languages are usually, but not always, commercially viable
products. Examples of real languages are COBOL, FORTRAN, PL/1, and (to a
lesser extent) Algol and LISP. The properties that make a programming
language a real language unfortunately also serve to detract from the ease of
verifying programs in that language. Most of these undesirable properties
can be summed up under the term "lack of semantic cleanliness."

A language has a "clean" semantics if the definition of the language is
elegantly expressible in some formal medium. There are many good reasons
why real languages are not semantically clean. The first reason is the size
of the language. A real language is the incorporation of the special interests
of many groups of users, whose interests are not always compatible. The result
is often that large numbers of features are added on. The addition of these
features not only complicates the semantics of the language, but often violates
the spirit that motivated the initial conception of the language. PL/1 is a
good example of this tendency. In a desire to overcome some of the difficulties
of FORTRAN, COBOL, and Algol, the designers of PL/1 created something larger
than any of its ancestors. Considered alone, the size of real languages is a
major obstacle to verification. Second, most real languages must concede
syntactic generality in the interests of a fast implementation, either in
the compiler or the generated code. Examples of these dependencies are
limitations in the number of nestings (COBOL) or in the complexity of an
arithmetic expression in certain places (FORTRAN). Lack of syntactic
generality makes the syntactic analysis phase of the verification system more
difficult to implement. Third, most languages must have some features that
deal with the hardware or operating system. The environment division and
communication module of COBOL are examples of these features.
Standardization has served to make a uniform interface between the language and the
environment. However, the fact that a variable is SYNCHRONIZED or that there
are 100 logical records in a block will not affect the correctness of a COBOL
program, but may affect the performance of that program. Fourth, most real
languages are the products of an evolving development, as illustrated by the
fact that many real languages have numbers after their names to indicate the
particular dialect in the sequence (FORTRAN IV, Algol 60, LISP 1.5). In
many cases, there is a desire for upward compatibility, so that bad features
that could have been eliminated remain, "augmented" by the improvements.

Another aspect of this problem is that most of the currently important
languages got their start before the aesthetics of programming were well
established. Thus, many real languages lack features such as strong typing, block
structure, and flexible procedure and macro facilities. Structured programming
practices are motivated by a desire to infuse these new aesthetics into the
programming world. Perhaps verification will generate its own set of aesthetics
by which the design of future programming languages will be guided. Lastly,
there is the problem that even if the semantics of a real language is
clean, it is usually stated in natural language in a standards manual (12).
A standards manual may be all right for programmers and language implementers,
but it is certainly difficult to use for verification. If the standards people had
some clean vision of a language in mind, they should have written down the
formal semantics somewhere. The formal definition of PL/1 is such an attempt.
The length of the formal definition of PL/1 is a commentary on our tools for
specifying programming language semantics (e.g., VDL) and on the inherent
semantic complexity of real languages.

Before solutions to these problems are considered, there is one major
constraint to these solutions: the solutions must have minimum effect on
the languages themselves. Manufacturers do not want to rewrite their compilers,
and users do not want to rewrite their programs. Thus, the solution to the
verification problem for real languages must be incremental. Research in
new languages that support verification is very important, but the data
processing community will ignore this research unless verification can be shown
useful on a more immediate basis.

The problem of language size has two aspects, syntactic and semantic.
When a language has syntactic complexity, there are many different ways to
do the same thing. When a language has semantic complexity, there are many
things that can be done. In cases where there exists more syntactic
complexity than semantic, verification can be done on a program written in an internal
form which is syntactically simple, i.e., there is only one way to do any given
thing. Automatic translation from the external form to the internal form is
relatively straightforward. Semantic complexity is handled primarily by
subsetting, which involves choosing a sublanguage that permits only the desired
semantic features. Real languages differ in the extent to which subsets can
be generated for them. If a language construct is necessary, but also
semantically messy, there will be trouble in doing subsetting. This is precisely
the trouble with the go to. It is clearly necessary in languages like FORTRAN
and COBOL, but also permits the writing of programs with very messy control
structures. The solution to this type of problem takes several forms:

(1)  Try  to  change  the  language. 

(2) Establish management techniques to prevent abuse of
the construct.

(3)  Develop  a preprocessor  for  the  language,  which  permits 
desirable  constructs  in  place  of  harmful  ones. 

It is the goal of this project to propose a subset of COBOL such that
preprocessing need not take place. For more information, see Section 4, on
structured programming and COBOL.

In the case of sacrificing syntactic generality for the speed of the
compiler or the generated code, it is desirable to allow the verification
system to process a language with more syntactic generality. To prevent the
successful verification of programs that will not even compile, one then
requires that all programs be run through the syntactic analysis phase of
the compiler before verification. Thus the compiler can check the special
cases of the language, allowing the verification system's parser to be simpler
(see Appendix I). Such a decision is made in this effort.

With regard to the features of a real language that are dependent on the
hardware or the operating system, there are two strategies: either to
axiomatize them or ignore them. Statements in COBOL's ENVIRONMENT DIVISION, and
items like SYNCHRONIZED or the number of logical records per block, can be
ignored since they do not affect the outcome of the program. Special kinds
of file I/O and communication with the operating system can be axiomatized
as properties of the abstract machine on which a program runs. The formal
definition of a programming language involves specifying the instruction set
of an abstract machine that runs the program, and specifying the interpreter
that runs the program on the abstract machine.

The  technology  of  program  verification  has  ignored  several  issues  that 
are  essential  in  the  verification  of  programs  written  in  real  languages.  One 
reason  for  this  phenomenon  is  that  researchers  in  program  verification  are 
still  having  difficulty  in  applying  program  verification  to  toy  languages 
(partly  because  verification  is  a comparatively  new  technique  and  because  it 
is an extremely difficult one). These problems are also difficult in
themselves. Among the problems are:

(1)  Finite  machine  arithmetic. 

(2)  Clean  termination  and  run-time  errors. 

(3)  Validity  of  input  data. 

The  issue  of  finite  machine  arithmetic  is  particularly  acute  in  COBOL 
because  data  items  have  no  more  digits  than  they  need  for  internal  storage, 
while  other  languages  have  the  (relatively  large)  word  size  of  the  machine. 
Thus,  overflow  and  truncation  occur  more  often.  Consideration  of  these 
items will appear in the section dealing with the semantics of COBOL data
items.
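
The effect of limited digit capacity can be sketched as follows. The
helper names store_unsigned and add_checked are hypothetical, and the
model is an assumption made for illustration: a result too large for the
receiving item loses its high-order digits, as happens when a value is
moved into a shorter numeric item. The precise rules for arithmetic
statements are those of the standard (12), not of this sketch.

```python
from typing import Tuple


def store_unsigned(value: int, digits: int) -> int:
    """Keep only the low-order `digits` decimal digits of a non-negative value."""
    if value < 0:
        raise ValueError("this sketch models unsigned items only")
    return value % (10 ** digits)


def add_checked(a: int, b: int, digits: int) -> Tuple[int, bool]:
    """Add two values into an item of the given size.

    Returns the stored result and a flag telling whether the exact sum
    did not fit (the situation a verifier has to reason about).
    """
    exact = a + b
    overflowed = exact >= 10 ** digits
    return store_unsigned(exact, digits), overflowed
```

For a three-digit item, adding 1 to 999 in this model stores 0 and raises
the flag, so an assertion such as "the total is the sum of its parts"
holds only under an explicit bound on the operands.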

Clean termination has been described in an earlier section. Because
of the limited scope of this project this issue will not be dealt with at
this time. Clean termination assumes the absence of run-time errors. However,
such assumptions cannot always be made, as is the case in hardware and
operating system errors and in situations where input data is invalid (see below).
At some point such possibilities should be considered in efforts to verify
programs in real languages.


In verification the assumption is made that input data is valid (with
respect to type, range of values, etc.). One of the greatest difficulties
in assuring the reliability of programs in real languages is that such
assumptions cannot be made. In other words, it is a frequent occurrence
that input data are faulty, and programs must be written to account for
such situations. A real program will typically have several degraded modes
of performance (without blowing up), depending on the severity of the error.
Thus even if a single record is messed up, all other records may be
processed correctly. There is a need in program verification to anticipate
such occurrences and to make the input assertions for these programs as
weak as possible.

4. Structured Programming and COBOL

There is a growing interest in various techniques for increasing the
"well-structuredness" of COBOL programs. This section discusses their
impact on verification. The techniques fall into the categories of
preprocessors and restrictions on the way in which COBOL programs are written.

The intent of these techniques is to simulate a block-structured
language, in which control is nested. This kind of structure within a program
makes the program easier to understand and debug. The conclusions are less
certain for proof.

Let us first examine the preprocessors. Instead of the go to, they
offer a set of control primitives such as do...while, if...then...else, case,
and others. The intent is that such well-behaved control structures are
easier to axiomatize than the go to, and thus it would be easier to prove
programs using only those constructs. This was the belief of Hoare (5), in
his axiomatization of PASCAL. However, Knuth (13) in his paper on structured
programming with the go to reports that go to's are surprisingly easy to
axiomatize (see the appendix on the semantics of the COBOL subset). It is
only necessary to put assertions at each label (and at PERFORM loops). Thus,
there is no decrease per se in the complexity of verification when go to's
are removed from the language.
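
The idea of attaching an assertion to every label can be sketched as
follows. Everything here is an assumption made for illustration: the
program is represented as labelled blocks that each return a new state
and the next label, and the function run_with_label_assertions is a
hypothetical interpreter that checks the assertion of a label each time
control arrives there. A verifier would instead prove, for each path from
one label to the next, that the first assertion implies the second.

```python
from typing import Callable, Dict, Optional, Tuple

State = Dict[str, int]
# A labelled block updates the state and names the next label (None = stop).
Block = Callable[[State], Tuple[State, Optional[str]]]
Assertion = Callable[[State], bool]


def run_with_label_assertions(
    blocks: Dict[str, Block],
    assertions: Dict[str, Assertion],
    start: str,
    state: State,
) -> State:
    """Follow the go to's, checking the assertion attached to each label."""
    label: Optional[str] = start
    while label is not None:
        assert assertions[label](state), f"assertion at {label} failed"
        state, label = blocks[label](state)
    return state


def triangle(k: int) -> int:
    return k * (k + 1) // 2


# Example program: TEST and BODY are assumed label names; n0 records the
# initial value of n so that the assertions can refer to it.
BLOCKS: Dict[str, Block] = {
    "TEST": lambda s: (s, "BODY" if s["n"] > 0 else None),
    "BODY": lambda s: ({**s, "total": s["total"] + s["n"], "n": s["n"] - 1}, "TEST"),
}
ASSERTIONS: Dict[str, Assertion] = {
    "TEST": lambda s: s["total"] + triangle(s["n"]) == triangle(s["n0"]),
    "BODY": lambda s: s["n"] > 0 and s["total"] + triangle(s["n"]) == triangle(s["n0"]),
}
```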

However, structured programming was intended to limit the complexity
of the programs being written by reducing the average number of control paths
per line of code. Structured programming also reduces the number of patterns
of control paths by forcing the paths to be nested. Since the complexity of
Floyd verification is roughly proportional to the number of paths, it seems
that (on the average) structured programs, whether written by preprocessor
or by management fiat, are easier to verify than unstructured ones.

There is another sense in which the term "structured programming" is
applicable. Structure can be gained by breaking large programs up into
small, loosely coupled pieces. This is the modularity concept of Parnas (14),
in which the change in a single design decision affects only one module.
Unfortunately, the COBOL language itself does not provide facilities (such
as flexible procedure calls or macros) for accomplishing this goal. In
many cases management techniques are used to break up a large programming
project into small, manageable pieces. One of the methods for accomplishing
modularity is to hide the format of data structures within a single module.
Since data structures (i.e., shared files) are precisely the means by which
COBOL programs communicate, the format information for the data structures
tends to be scattered over many programs. Thus, a change in file formats
may require a lot of reprogramming, more than might be necessary if
concepts of modularity were more visible in COBOL.

Decomposing a program into hierarchical levels of abstraction has been
suggested (Dijkstra, 15) as a means for handling program complexity. Recently
Robinson and Levitt (16) have proposed a method for formalizing a level of
abstraction in a self-contained way, and for decomposing the proof of the
large program into many small independent proofs, one for each level of
abstraction. The applicability of this work to COBOL is perhaps a long way off, because
the hierarchical method depends strongly on the notion of data abstraction.
COBOL programs do not seem to have data structures that can be abstracted very
easily. In spite of the tree-structured data in COBOL programs, all data
structures seem to have one level of detail that is not hidden from parts of
the program. Abstraction would not in this case lead to simple programs at
higher levels. However, the problem bears further study.

5.  Discussion  of  the  COBOL  Language 

COBOL is a language of fairly simple control, but its data structures
and operations are rich. The area of most immediate concern for verification
is the elementary data item. All computation in COBOL is character-oriented.
Numeric data items have pictures and sizes. Arithmetic operations must
consider truncation and overflow with almost every operation. Even without the
primitives STRING and UNSTRING, the manipulation of strings is inherent in
each operation.

The most important feature of an elementary numeric data item is its
PICTURE, a specification of how it would look if it were printed out.
For example, a picture specification of 999 would print out a three-digit
integer. The sign and decimal point information are also included in the
specification. Although the decimal point in numeric items is implicit
(remembered by the system but not stored with the item), the sign (if present)
is encoded in one of the digits of the stored data item. A great deal of
string processing can be performed by a simple assignment operation, because
of the editing feature. There is a special type of data item called numeric
edited, whose picture specification can contain additional information
concerning inserted characters, zero suppression, sign printing, and currency
symbols. For example, a data item with a picture specification of $$$,$$$.99
would print out $10,000.00 when its contents are 10000 and $5.63 when its
contents are 5.63. Notice that the comma disappears and the dollar sign moves
over when the value of the item decreases. These features can be used to
generate fancy reports, and can also create complexities with regard to
verification. The editing must be axiomatized, and functions must be added to
the assertion language in order to state properties of numeric edited data
items. There are many data features that are not axiomatized by the subset
provided by this project, e.g., string processing, table handling, sorting,
and overlays.
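
A function of the kind that would have to be added to the assertion
language can be sketched for the single picture $$$,$$$.99. The name
edit_floating_dollar is hypothetical, and the sketch makes several
assumptions not stated in the text: the value is non-negative, excess
decimal places are truncated, the result is padded with leading spaces
to the ten character positions of the picture, and a zero integer part
is suppressed entirely.

```python
from decimal import Decimal, ROUND_DOWN

PICTURE_WIDTH = len("$$$,$$$.99")   # ten character positions
MAX_VALUE = Decimal("100000")       # one $ is reserved, leaving five digit positions


def edit_floating_dollar(value) -> str:
    """Edit a value roughly as a PICTURE $$$,$$$.99 item would display it."""
    amount = Decimal(str(value)).quantize(Decimal("0.01"), rounding=ROUND_DOWN)
    if amount < 0 or amount >= MAX_VALUE:
        raise ValueError("value does not fit the picture in this sketch")
    whole, cents = divmod(int(amount * 100), 100)
    # Leading zeros are suppressed; the comma appears only when needed.
    digits = f"{whole:,}" if whole else ""
    # The dollar sign floats to just left of the first remaining character.
    return f"${digits}.{cents:02d}".rjust(PICTURE_WIDTH)
```

With these assumptions the two examples from the text come out as
"$10,000.00" and "$5.63" preceded by five spaces. An assertion about a
numeric edited item could then be stated in terms of such a function
applied to the numeric value moved into it.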

COBOL is a language of input-output. There is sequential, random, indexed,
and console I/O. Any verification system that deals with COBOL must handle
I/O to some extent. This subset will handle console I/O and some very
simplified versions of sequential I/O, enough to verify some elementary programs.
We are using a method similar to Hoare's axiomatization of I/O in PASCAL (5).
In it, a file is a sequence of values for the set of variables that constitute
the input or output record. Each file has a pointer that designates the
current record. Reading the file simply moves the pointer, while writing the
file adds to the sequence and changes the pointer as well.
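
The file model just described can be sketched as a small Python class.
The class name SequentialFile and its methods are assumptions made for
this illustration; in particular, returning None at the end of the file
and leaving the pointer just past the new record after a write are
choices of the sketch, not statements of the report.

```python
from typing import Dict, Iterable, List, Optional

Record = Dict[str, int]   # values for the variables that make up one record


class SequentialFile:
    """A file as a sequence of records plus a pointer to the current record."""

    def __init__(self, records: Iterable[Record] = ()) -> None:
        self.records: List[Record] = [dict(r) for r in records]
        self.pointer = 0  # index of the current record

    def read(self) -> Optional[Record]:
        """Return the current record and move the pointer; None at end of file."""
        if self.pointer >= len(self.records):
            return None
        record = self.records[self.pointer]
        self.pointer += 1
        return record

    def write(self, record: Record) -> None:
        """Add a record to the sequence and change the pointer as well."""
        self.records.append(dict(record))
        self.pointer = len(self.records)
```

Reading leaves the sequence unchanged, so an assertion about the file
contents before a READ still holds after it; only assertions mentioning
the pointer need to be updated.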

The records in COBOL are tree-structured, an attribute which presents
a naming problem. Several elementary items may have the same local name,
with the ambiguity resolved by different qualification statements. The
CORRESPONDING option makes use of this feature. A COBOL verification system
must incorporate the same naming mechanism that COBOL uses.

Two data features not incorporated in the subset are the REDEFINES
and RENAMES options. REDEFINES allows a data item, either group or
elementary, to have a different name and a different definition (i.e., set of
picture specifications). It is like FORTRAN's COMMON statement, except that
the sharing is done within one program. RENAMES allows the renaming of a
sequence of elementary data items, but the same pictures are retained. It is
analogous to the FORTRAN EQUIVALENCE statement. The REDEFINES option is much
more difficult to handle, since it involves representation decisions in the
machine, e.g., the number of characters contained in a group or elementary
data item. These decisions also involve alignment and word boundaries,
factors which vary depending on the implementation machine.

6.  COBOL  Subset  for  Verification 

We have carefully examined the syntax and semantics of the COBOL
language as defined by (12), and have arrived at a subset suitable for verification
according to the criteria described in the previous sections of this report.
The results of this research are described in Appendices I, II, and III.
Appendix I describes the syntax of the PROCEDURE DIVISION for the COBOL subset,
the method (transduction grammars) for describing such syntax, the software
system for manipulating these grammars, and the parsing program that uses them.
Appendix II describes the syntax of the DATA DIVISION. Not all of the decisions
have been made concerning the transductions for names and pictures of data items,
so that the transductions are left out. Appendix III contains a discussion of
the issues involved in the description of the semantics of COBOL statements
and data types. This is a difficult problem, perhaps the most difficult of
the project; only preliminary results have been shown here.

If  one  were  to  examine  a list  of  the  primitives  that  have  been  eliminated 
from  the  COBOL  subset  for  verification,  they  could  have  been  eliminated  for 
one  of  two  reasons : 

(1)  A primitive  was  considered  to  be  undesirable  for  the 
purposes  of  verification. 

(2) A primitive was considered to be reasonable for verification,
but was not deemed "essential." Thus it was
eliminated from this subset, which had to be kept small.

Very few constructs have been eliminated from the language for reason (1):
the ALTER statement, the "abbreviated combined conditional" relational
expression, the MOVE statement between group data items, and the REDEFINES and
RENAMES statements. Even these features could be axiomatized, but with great
difficulty.

The method for representing the COBOL grammar in the verification system
is designed to allow extensions to the language at any time. It is predicted
that further work in the project will call for the enlargement of the subset
of COBOL handled by the verification system.

7. Assertion Language and Rules of Inference

The object of the assertion language is to allow a COBOL programmer to
state any property of a COBOL program in an elegant way. This involves
experimentation with many different COBOL programs to see what must be said
and how to say it. At this time the assertion language design is in its very
preliminary stages, and this section is a set of general guidelines that will
motivate the final assertion language design.

Formulae in the assertion language must be handled by a general
theorem-proving program, so that the syntactic basis for any assertion language
must be first-order logic. The assertion language must deal with numeric
quantities, so that arithmetic operators and relations are also included.
Although sets do not occur in COBOL, they are useful in aggregating a
multiplicity of items in assertions. Sequences appear in the axiomatization of
files and strings, and are an otherwise useful structure. These general
features should occur in any assertion language.

Instead of augmenting the syntax of the assertion language by adding
language-dependent constructs, it is useful just to use functions and predicates
to define these constructs. In order to perform deductions, axioms and
definitions are used to describe properties of the functions and predicates.
Axioms constitute high-level rules of inference, and definitions can be viewed
as substitution rules.

The particular functions used to describe the properties of COBOL are
interesting. They fall into four categories:

(1) Type information. These functions tell whether an alphanumeric
data item contains alphabetic or numeric data at a given time.

(2) Values of data items. Each numeric data item has a numeric
value (real or integer) and a print value (character string).

(3) Naming information. The semantics of some COBOL operations
depend on the data names above and below a data item in the
tree-structured data definition.

(4) Operations on data items. Truncation, rounding, and editing
of data items require special functions.

The enumeration and definition of these predicates and functions is
now in progress.
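
As a rough illustration of categories (2) and (4), the Python sketch below models an unsigned numeric item described only by its counts of integer and fractional digits. The names `store` and `print_value`, and the choice to drop excess high-order digits, are assumptions made for this example only; they are not the functions of the assertion language, which are still being enumerated.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

def store(value, int_digits, frac_digits, rounded=False):
    """Numeric value held by an unsigned item after `value` is stored in it.

    Assumption: excess low-order digits are truncated (or rounded half up
    when `rounded` is set) and excess high-order digits are dropped.
    """
    quantum = Decimal(1).scaleb(-frac_digits)      # e.g. 0.01 for two places
    mode = ROUND_HALF_UP if rounded else ROUND_DOWN
    fitted = Decimal(value).quantize(quantum, rounding=mode)
    return fitted % (Decimal(10) ** int_digits)

def print_value(number, int_digits, frac_digits):
    """Character-string form: digits only, zero-filled, implied decimal point."""
    digits = int(number.scaleb(frac_digits))
    return str(digits).zfill(int_digits + frac_digits)

stored = store("1234.567", 3, 2)
print(stored, print_value(stored, 3, 2))
```

Keeping the numeric value and the print value as two separate functions of the same item mirrors the distinction drawn in category (2); an assertion could then refer to either one.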

8.  Structure  of  Proposed  Verification  System 

In our view of the problems of verification in real languages, we
actually require the assistance of the compiler in the verification process.

In addition, large parts of the verifier are table driven, so that certain
changes in the COBOL subset will have a minimal effect on the programs
comprising the verification system.

The proposed verification system is shown in Figure 1. In it, systems
or processes (i.e., parts of the verification system) are denoted by ovals
or circles. Documents or programs (i.e., the data that is processed by the
verification system) are represented by rectangles. Knowledge encoded in
system tables is represented by diamonds.

A program is first compiled by a standard COBOL compiler to check for
syntax errors. Then user-supplied assertions are added to the program text,
and the combined argument is fed to the parser of the verification system.
Using the syntactic specifications for the language (the transduction grammar),
the parser creates an internal form for the COBOL program. The verification
condition generator takes the program in internal form and (using its knowledge
of COBOL operations) produces the verification conditions. The verification
conditions are then fed to the interactive deductive system, which attempts
to produce a proof of the verification conditions (with the help of a human).

The scope of this project calls for the programming of the parser and
verification condition generator. However, the most difficult issues are
involved in deciding formal representation media for the items in the three
diamonds, and in encoding the COBOL syntax and semantics using the
representation media.

The system is being implemented on the PDP-10 at the Artificial
Intelligence Center at Stanford Research Institute, using the INTERLISP
programming environment. The system provides sophisticated interactive
facilities for all phases of the programming process. The SRI facility
is accessible through the ARPANET (address SRI-AI). Much of the
documentation for the project is on-line at the same facility.

9.  Conclusions 

It is our feeling that we have uncovered some very interesting areas
of study, and that COBOL verification is feasible and challenging. The
level of effort does not permit as deep an examination of some of the
issues as we had hoped, but this research provides a basis for further work.

The  current  status  of  the  project  can  be  summed  up  as  follows: 

(1)  We  have  a thorough  knowledge  of  the  general  issues  of 
COBOL  verification. 

(2)  We  have  decided  on  the  syntax  of  the  COBOL  subset  but 
have  not  yet  finished  axiomatizing  it.  However,  a 
substantial  amount  of  work  has  already  been  done. 

(3)  The  parser  has  been  written,  and  the  verification 
condition  generator  has  been  sketched  out. 

(4)  The  documentation  is  adequate  and  up  to  date. 

(5) Some sample COBOL programs have been studied, and assertions
for them have been written.

(6)  Except  for  the  exact  choice  of  auxiliary  functions,  the 
assertion  language  has  been  designed. 

The  following  tasks  remain  to  be  done: 

(1) Completion of semantic axiomatization, the choice of functions
of the assertion language, and the rules of inference for the
functions. These tasks are all related.

(2) Implementation of the verification condition generator.
Given the completion of task (1), this is a relatively
straightforward programming effort.

(3) More work on examples, both in writing assertions and in generating
hand-proofs. We will devote our attention to programs containing
about 15-50 lines of PROCEDURE DIVISION.

(4) Completion of the study of structuring methods (including
hierarchical methods) as applied to COBOL verification.
At this time such efforts do not seem so fruitful as they
did earlier. Perhaps we will have to devise slightly new
techniques for partitioning the proofs of COBOL programs.

Several  observations  may  be  made: 

(1) COBOL is an interesting language and is well designed.

(2) Structure and abstraction are not as promising as
originally anticipated (see 4 above).

(3) We have been able to bring a surprising amount of
technology to bear on the problems encountered.

The  following  problems  either  exist  now  or  are  anticipated: 

(1) With the functions for editing and truncation, verification
conditions may be longer than originally anticipated.

(2) It is taking more time than originally anticipated to arrive
at a formal statement of COBOL semantics.

The following issues, although they will not be covered in the current
effort, are important and deserve to be studied in future projects:

(1) Clean termination of COBOL programs.

(2) Graceful degradation in the presence of invalid data.

(3) The application of verification techniques to other
areas related to the reliability of COBOL programs,
e.g., testing, symbolic evaluation, and debugging.

It certainly seems as though the verification of COBOL programs is
possible, and eventually may become cost effective.

1. Introduction

We intend to use a table-driven language transducer for
initial processing of COBOL programs that are to be
verified. Syntax transduction is the process of translating
an input program from the standard form in which COBOL
programs are written by users of the language to an abstract
form with the same semantic import but with a uniform
structure easily manipulated by a verification condition
generator (the next phase of verification). Such a
procedure is especially helpful in dealing with COBOL: this
language has extensive syntactic complexities that often do
not correspond to comparable semantic complexities. The
point of the syntactic complexity of the language is to
permit programmers to write in an expressive and natural
format. While such a format is quite suitable for human
consumption, it is inappropriate for the sorts of machine
manipulation needed in verification, and it is consequently
beneficial to translate to the syntactically much simpler
abstract form that we have devised.

The correspondence between COBOL and Abstract COBOL is
specified by a transduction grammar. Such a grammar
consists of a set of BNF productions to describe the COBOL
language, and a corresponding transduction for each
production. The transduction is a LISP program which
computes the abstract form of the language fragment
specified by the associated production. Thus we translate a
COBOL program to abstract form by using a parser to analyze
a valid program into a 'parse tree' according to the
productions of the grammar, and then processing the parse tree
from bottom to top, using transductions to obtain the parts
of the desired Abstract COBOL program.

Our transduction grammar for a substantial subset of
the constructs allowed in the COBOL procedure division,
together with various parsing and grammar manipulating tools
(described in Section 2), not only specifies the
correspondence between COBOL and Abstract COBOL, but also
constitutes an efficient algorithm for translating between
the two languages. As a result of this translation, while a
user may submit to the COBOL Verifier a general COBOL
program (suitably annotated by logical assertions), parts of
the system operating after transduction need to deal only
with a very limited set of semantic primitives. For
example, the translation expresses all ADD, SUBTRACT,
MULTIPLY, DIVIDE, COMPUTE, and MOVE sentences (except for
the CORRESPONDING option, which is handled separately) in
terms of two semantic primitives SET$ and SETROUNDED$.
Similarly, GO TO ... DEPENDING ON ... sentences are
expressed in Abstract COBOL by an equivalent set of IF and
GO TO sentences. A detailed description of these
correspondences, and of the primitives of Abstract COBOL, is
given in Section 3.
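
The pairing of productions with transductions, and the bottom-to-top processing of the parse tree, can be pictured with the following minimal Python sketch. It is not the INTERLISP implementation: the production strings, the tree layout, and the name `transduce` are hypothetical, Python tuples stand in for LISP lists, and `None` stands in for NIL.

```python
# Each production is paired with a transduction: a function from the abstract
# forms of the right-hand-side symbols to the abstract form of the whole.
# The production strings below are hypothetical, not taken from the CTG.
TRANSDUCTIONS = {
    "move -> MOVE operand TO operand":
        lambda _move, source, _to, target: ("SET$", target, source, None),
    "operand -> IDENTIFIER":
        lambda name: name,
}

def transduce(node):
    """Process a parse tree from the leaves upward.

    A leaf is a token string; an interior node is (production, children).
    """
    if isinstance(node, str):
        return node
    production, children = node
    parts = [transduce(child) for child in children]
    return TRANSDUCTIONS[production](*parts)

parse_tree = ("move -> MOVE operand TO operand", [
    "MOVE",
    ("operand -> IDENTIFIER", ["x"]),
    "TO",
    ("operand -> IDENTIFIER", ["y"]),
])
print(transduce(parse_tree))
```

Because the transductions live in a table keyed by production, extending the accepted language in this sketch means adding table entries rather than changing `transduce`.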

Finally, observe the advantage that derives from
employing a COBOL Transduction Grammar (CTG) to drive the
transducer. Although we have made a number of simplifying
assumptions for the purposes of the initial phase of the
project activity, it will be a simple matter to extend the
subset of COBOL that is accepted just by augmenting the CTG.
Such extensions require no modification of the transducer.

2.  The  COBOL  Language 

A.  Amendments  to  the  Language 

For the purpose of this project, we have designed a CTG
for the COBOL procedure division which does not precisely
correspond to the language described in the 1974 ANSI
standard for COBOL (q.v. American National Standard
Programming Language COBOL, American National Standards
Institute, Standard Number X3.23-74). Our amendments are of
two sorts and we now proceed to describe them in turn.

The first sort of amendment moderately extends the
language. In the arithmetic sentences ADD, DIVIDE,
MULTIPLY, and SUBTRACT, our CTG permits the arguments to be
arbitrary arithmetic expressions rather than just
identifiers or literals. In the GO TO ... DEPENDING ON ...
sentence, we similarly generalize the qualifying identifier
so as to allow any arithmetic expression. In the
PERFORM ... AFTER ... construct, we allow arbitrary nesting
rather than a maximum of three levels as in X3.23. More
importantly, the CTG specifies a generalized (but fully
compatible) PERFORM statement which, for example, permits
the construction

    PERFORM procedure-name1 VARYING I FROM 1 BY 1 UNTIL I=10
        5 TIMES

This construction is not allowed in X3.23 but is
semantically consistent with it when given the meaning

    PERFORM procedure-name1 VARYING I FROM 1 BY 1 UNTIL I=10
        VARYING J FROM 1 BY 1 UNTIL J=5

where J is some new identifier not otherwise used in the
program.

These extensions are permitted for a number of reasons.
First, because they are semantically consistent with X3.23,
it is no more difficult to verify programs written in the
more general forms. Second, since the generalized forms are
more syntactically natural (i.e., yield greater syntactic
uniformity in the resulting language) than the original
forms, the CTG is shorter and clearer than it would
otherwise be. Finally, we could easily augment appropriate
rules of the CTG to exclude these extensions if, for some
reason, that was eventually found desirable. But, in any
case, since the extensions are all compatible with X3.23,
the CTG does correctly specify the transduction into
Abstract COBOL of standard language programs not employing
the extensions.

Our second sort of amendment has consisted of
subsetting the procedure division so that, in this limited
initial effort, we can deal with a language of manageable
proportions. On the other hand, we wish to include enough
of COBOL to demonstrate the practicality of applying
verification techniques to COBOL, as well as to begin to
detail the techniques required. Thus we have tried to
choose a group of verbs that is representative of COBOL and,
moreover, is sufficient in scope to permit the writing and
verification of some reasonable example programs. The
technique of table-driven syntax transduction makes it quite
easy to extend the subset with which we deal, and such
extension would be a natural part of a continuation of the
present effort.

The particular subset we have chosen includes the verbs
ACCEPT*, ADD, COMPUTE, CLOSE*, DISPLAY, DIVIDE, GO, IF,
MOVE, MULTIPLY, OPEN*, PERFORM, READ, STOP*, SUBTRACT, and
WRITE. Asterisks indicate verbs for which the CTG allows
only a subset of the alternative constructions in X3.23. We
have excluded verbs dealing with string manipulation, table
handling, merge and sort operations, error processing and
debugging, complex file processing, interprocess
communications and multi-processing, and report generation.
The verbs thus excluded are ALTER, DELETE, DISABLE, ENABLE,
ENTER, EXIT, GENERATE, INITIATE, INSPECT, MERGE, RECEIVE,
RELEASE, RETURN, REWRITE, SEARCH, SET, SORT, STRING,
SUPPRESS, TERMINATE, UNSTRING, and USE. We have also
excluded the 'abbreviated combined conditional' relational
expression.

B.  The  Correspondence  between  COBOL  and  Abstract  COBOL 

The  object  of  this  section  is  to  describe,  in  general 
terms,  the  correspondence  between  standard  COBOL  programs 
and  their  transduced  versions  in  Abstract  COBOL.  Strictly 
speaking,  the  CTG  as  given  in  Appendix  A is  the  definitive 
specification  of  this  correspondence — it  is  both  exact  and 
procedural.  However,  the  level  of  detail  in  Appendix  A, 
together  with  the  formal  languages  of  transduction  grammar 
and  CLISP  that  are  used,  may  be  quite  difficult  for  the 
uninitiated  reader.  Consequently  we  give  a more  tutorial 
presentation  here. 

The procedure division of a COBOL program consists of a
number of labeled sections, each of which is made up of a
number of labeled paragraphs. These paragraphs, in turn,
are made up of a variable number of sentences.
Alternatively, a program may omit the intermediate level
(sections) and consist simply of a number of paragraphs.
Both cases are represented in Abstract COBOL by lists of the
form

    (PROCEDUREDIVISION$
        (SECTION$ section-name1
            (PARAGRAPH$ paragraph-namea
                sentence sentence ...)
            ...
            (PARAGRAPH$ paragraph-namez
                sentence sentence ...))
        ...
        (SECTION$ section-namen
            (PARAGRAPH$ paragraph-nameaa
                sentence sentence ...)
            ...
            (PARAGRAPH$ paragraph-namezz
                sentence sentence ...)))

We represent the case of a 'sectionless' program by taking
n=1 and section-name1=NIL, i.e., by

    (PROCEDUREDIVISION$
        (SECTION$ NIL
            (PARAGRAPH$ paragraph-namea
                sentence sentence ...)
            (PARAGRAPH$ paragraph-nameb
                sentence sentence ...)))
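
To make the list structure concrete, here is a small Python sketch that builds the same shapes from tuples. The helper names (`paragraph`, `section`, `procedure_division`) are invented for this illustration, `None` plays the role of NIL, and the strings `"s1"`, `"s2"`, `"s3"` are placeholders for already transduced sentences.

```python
def paragraph(name, sentences):
    """(PARAGRAPH$ name sentence sentence ...)"""
    return ("PARAGRAPH$", name, *sentences)

def section(name, paragraphs):
    """(SECTION$ name paragraph paragraph ...)"""
    return ("SECTION$", name, *paragraphs)

def procedure_division(sections=None, paragraphs=None):
    """Build the top-level list; pass `paragraphs` alone for a sectionless program."""
    if sections is None:
        # Sectionless case: a single section whose name is NIL (None here).
        sections = [section(None, paragraphs or [])]
    return ("PROCEDUREDIVISION$", *sections)

sectionless = procedure_division(paragraphs=[
    paragraph("PARA-A", ["s1", "s2"]),
    paragraph("PARA-B", ["s3"]),
])
print(sectionless)
```

Representing both cases with one shape means that later phases never need to ask whether the source program used sections.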


The sentences of Abstract COBOL serve the same function
as the sentences of COBOL: they are the building
blocks of the language. Standard COBOL sentences are of two
kinds--those represented in Abstract COBOL by a single
internal sentence and those represented in Abstract COBOL by
several internal sentences. The first sort of sentence is
defined by the nonterminal 'sentence1' in the CTG. Such
sentences are those using one of the following verbs:
ACCEPT, CLOSE, GO, IF, PERFORM, READ, STOP, WRITE, ADD CORR,
and SUBTRACT CORR. The second sort of sentence, comprising
sentences with the verbs COMPUTE, DISPLAY,
GO ... DEPENDING, OPEN, MOVE, ADD, DIVIDE, MULTIPLY, or
SUBTRACT, is represented in the CTG by the nonterminal
'sentence2'. To some extent there is a possible trade-off
between the designation of the type of a COBOL sentence and
the complexity of the associated internal semantic
primitives. That is, forcing a sentence to be of the first
type may require the use of more complex internal primitives
than would transducing it as a 'sentence2' to a list of
internal sentences. To increase simplicity in the
verification condition generator and COBOL axiomatization,
we have therefore chosen to transduce to lists of simpler
internal sentences where possible.

We now sketch the internal equivalents of the various
COBOL sentences. Those derived from the nonterminal
'sentence1' are described first. Among these sentences,
those with verbs ACCEPT, CLOSE, GO (the simple case), IF,
READ, STOP, and WRITE are straightforwardly transduced. For
example,

    ACCEPT x FROM DAYTIME

becomes

    (ACCEPT x DAYTIME)

where ACCEPT is considered, in Abstract COBOL, to be a
semantically primitive function of two arguments.
Similarly,

    IF x+3<10; s1; ELSE s2.

becomes

    (IF (LT$ (+ 3 x) 10) s1' s2')

where s1' and s2' are the Abstract COBOL equivalents of the
external sentences s1 and s2.

Note that the CTG rules for condition and arithmetic
expression yield the functional form (LT$ (+ 3 x) 10) for
the COBOL infix expression x+3<10. The CTG translates any
condition or arithmetic expression into a functional form
employing only operators chosen from +, -, *, /, LT$, EQ$ or
GT$, or the logical operators AND, OR, NOT, ISALPHABETIC$,
and ISNUMERIC$.
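
One benefit of the uniform functional form is that a later phase can walk it with a single recursive function. The Python sketch below evaluates such a form against a table of variable values. It is an illustration only: the name `evaluate` and the tuple encoding are assumptions, ordinary Python arithmetic is used in place of COBOL's truncation and size rules, and ISALPHABETIC$ and ISNUMERIC$ are left out.

```python
import operator

# Operator table for the functional form; one entry per operator symbol.
OPERATORS = {
    "+": operator.add, "-": operator.sub,
    "*": operator.mul, "/": operator.truediv,
    "LT$": operator.lt, "EQ$": operator.eq, "GT$": operator.gt,
    "AND": lambda a, b: a and b,
    "OR": lambda a, b: a or b,
    "NOT": operator.not_,
}

def evaluate(form, values):
    """Evaluate a functional form; `values` maps identifiers to numbers."""
    if isinstance(form, tuple):
        symbol, *arguments = form
        return OPERATORS[symbol](*(evaluate(a, values) for a in arguments))
    if isinstance(form, str):
        return values[form]          # an identifier
    return form                      # a numeric literal

condition = ("LT$", ("+", 3, "x"), 10)   # the form given for x+3<10
print(evaluate(condition, {"x": 4}), evaluate(condition, {"x": 7}))
```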

There are two somewhat more complex cases. The first
of these is the PERFORM statement. We analyze a PERFORM
statement into three parts: the verb PERFORM, a body such as
FUM THROUGH FUMBAR, and a list of controls. Each control is
a qualifier such as 7 TIMES or AFTER J FROM 1 BY 3 UNTIL J
IS GREATER THAN 15 and is analyzed into a keyword and a
parameter. For example, the control 7 TIMES has keyword
TIMES and parameter 7. The control AFTER J FROM 1 BY 3
UNTIL J IS GREATER THAN 15 has keyword VARYING and parameter
(J 1 3 (GT$ J 15)). COBOL allows two other sorts of
control; these are 'UNTIL condition' and a defaulted control
(as in 'PERFORM FUM'). The first of these has keyword UNTIL
and as parameter the transduction of the condition. The
second has keyword ONCE and parameter NIL. Suppose the
external form of a PERFORM statement is

    PERFORM body control(1) ... control(n).

For each i between 1 and n, let kcontrol(i) be the keyword
associated with control(i) and let pcontrol(i) be its
parameter. Then we transduce to the internal equivalent

    (PERFORM kcontrol(n)
        (PERFORM kcontrol(n-1) ... pcontrol(n-1) NIL)
        pcontrol(n)
        NIL)

Thus the semantic primitive PERFORM in Abstract COBOL takes
four arguments: a control keyword, a transduction of the
body which is either

    (DO$ procedure-name1 procedure-name2)

(the simple case) or the transduction of the nested inner
PERFORM, a control parameter list, and a final argument
which is, at present, NIL. As a more complex example,
consider the sentence

    PERFORM PAR1 VARYING I FROM 1 BY 1 UNTIL I=10
        7 TIMES
        UNTIL X<10.

In internal form, this will be represented by

    (PERFORM VARYING
        (PERFORM TIMES
            (PERFORM UNTIL
                (DO$ PAR1 PAR1)
                (LT$ X 10)
                NIL)
            7
            NIL)
        (I 1 1 (EQ$ I 10))
        NIL)
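
The nesting can be produced mechanically by wrapping the body once per control. The Python sketch below does so; the name `nest_perform` is hypothetical, `None` stands for NIL, and the nesting order is taken from the worked example just given, in which the first control written becomes the outermost PERFORM. That ordering is an assumption of the sketch.

```python
def nest_perform(body, controls):
    """Wrap `body` in one four-argument PERFORM form per control.

    `controls` is a list of (keyword, parameter) pairs in the order written.
    The last control wraps the body first, so the first control ends up
    outermost, as in the worked example.
    """
    form = body
    for keyword, parameter in reversed(controls):
        form = ("PERFORM", keyword, form, parameter, None)
    return form

controls = [
    ("VARYING", ("I", 1, 1, ("EQ$", "I", 10))),
    ("TIMES", 7),
    ("UNTIL", ("LT$", "X", 10)),
]
print(nest_perform(("DO$", "PAR1", "PAR1"), controls))
```

A defaulted control, as in 'PERFORM FUM', would be passed as the single pair `("ONCE", None)`.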


Observe that when we proceed, in planned project work,
to use the transduced program as an input to a verification
condition generator, an additional item of information will
be needed for each iteration. This will be an inductive
invariant which describes the logical behavior of the
iteration body, and it will be recorded in the final
argument position of the corresponding PERFORM. Ideally,
one would prefer to have a verification system synthesize
such an invariant on the basis of the program text, but it
is not possible to do so in real programs given the present
state of the art of verification.

The remaining derivatives of 'sentence1' are those
using ADD CORR and SUBTRACT CORR. These translate to calls
on the semantic primitives ADDCORRESPONDING$ and
SUBTRACTCORRESPONDING$ with four arguments: the two apparent
subjects of the COBOL sentence, either ROUNDED or NIL as
specified by the external sentence, and a transduction of
the COBOL imperative sentence that is to be executed if a
SIZE ERROR occurs.

We now describe the translation of derivatives of
'sentence2' in the grammar. Recall that these generally
translate to several internal sentences. To represent the
COMPUTE and other arithmetic sentences, we introduce the
semantic functions SET$ and SETROUNDED$. Each is a function
of three arguments: the target of the operation, the source
expression, and an error sentence analogous to the final
ADDCORRESPONDING$ argument as described above. For example,
consider the COBOL sentence


    ADD x,y TO z,w ROUNDED.

We translate this to the two internal sentences

    (SET$ z (+ (+ z y) x) NIL)
    (SETROUNDED$ w (+ (+ w y) x) NIL)

Other arithmetic verbs are handled in the same fashion, with
the CTG transductions creating the proper functional form
source expression. Observe that in the example a SIZE ERROR
imperative statement is omitted; if it were present then its
transduction would appear in the proper argument positions
in each of the resulting internal sentences. We handle MOVE
in the same way, translating

    MOVE x TO y.

to

    (SET$ y x NIL)
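
The expansion of one arithmetic sentence into a list of SET$ and SETROUNDED$ sentences can be sketched in Python as follows. The function names are hypothetical, `None` stands for NIL, and the operand order inside the sums is chosen only so that the output reproduces the example above; the CTG transductions themselves may be organized differently.

```python
def add_to(addends, targets, on_size_error=None):
    """Translate `ADD a1, a2, ... TO t1, t2, ...` into internal sentences.

    `targets` is a list of (name, rounded) pairs; `on_size_error` is the
    transduced SIZE ERROR sentence, or None when it is omitted.
    """
    sentences = []
    for name, rounded in targets:
        source = name
        for addend in reversed(addends):
            source = ("+", source, addend)
        primitive = "SETROUNDED$" if rounded else "SET$"
        sentences.append((primitive, name, source, on_size_error))
    return sentences

def move(source, target):
    """Translate `MOVE source TO target`."""
    return [("SET$", target, source, None)]

for sentence in add_to(["x", "y"], [("z", False), ("w", True)]):
    print(sentence)
print(move("x", "y"))
```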

For the MOVE CORRESPONDING statement, we introduce the
semantic primitive MOVECORRESPONDING$ and translate, for
example,

    MOVE CORR a OF x TO b OF y.

to the internal form

    (MOVECORRESPONDING$ (OF y (b)) (OF x (a)))

where the OF subexpressions are the internal renditions of
COBOL qualifications.


The internal primitive DISPLAY is similar to ACCEPT
described above. However, since the COBOL language allows
DISPLAY to take a list of arguments, we transduce to a list
of internal DISPLAYs, e.g.,

    DISPLAY x,y,z UPON PRINTER.

becomes

    (DISPLAY x PRINTER)
    (DISPLAY y PRINTER)
    (DISPLAY z PRINTER)

OPEN is translated in a similar way, using the three
internal primitives OPENINPUT$, OPENOUTPUT$, and OPENBOTH$.
For example,

    OPEN I-O file1,file2.

becomes

    (OPENBOTH$ file1)
    (OPENBOTH$ file2)
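
Both DISPLAY and OPEN follow the same pattern: a sentence with a list of operands becomes one internal sentence per operand. A minimal Python sketch of that pattern is given below. The function names are hypothetical, and the pairing of INPUT with OPENINPUT$ and OUTPUT with OPENOUTPUT$ is inferred from the primitive names; only the I-O case appears in the example above.

```python
# Assumed mapping from OPEN mode to internal primitive (only I-O is
# confirmed by the example in the text).
OPEN_PRIMITIVES = {
    "INPUT": "OPENINPUT$",
    "OUTPUT": "OPENOUTPUT$",
    "I-O": "OPENBOTH$",
}

def display(items, device):
    """One internal DISPLAY sentence per displayed item."""
    return [("DISPLAY", item, device) for item in items]

def open_files(mode, files):
    """One internal OPEN sentence per file, chosen by mode."""
    return [(OPEN_PRIMITIVES[mode], name) for name in files]

print(display(["x", "y", "z"], "PRINTER"))
print(open_files("I-O", ["file1", "file2"]))
```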

Finally, we describe the transduction of GO ... DEPENDING
sentences. In general, such a sentence has the form

    GO TO n1,n2,... DEPENDING ON expression.

and is translated as though it had been the sentence

    IF expression=1; GO TO n1; ELSE
    IF expression=2; GO TO n2; ELSE
    ...

which, rendered in Abstract COBOL, is

    (IF (EQ$ expression 1)
        (GO n1)
        (IF (EQ$ expression 2)
            (GO n2)
            ...)).
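
The chain of IF sentences can be built from the label list by working backward from the last label. The Python sketch below shows one way to do this. The name `go_depending` is hypothetical, and the innermost ELSE branch is assumed to be NIL (`None`), so that control falls through when the expression matches no label; the text above elides that final branch.

```python
def go_depending(labels, expression):
    """Build the nested IF form for GO TO l1, l2, ... DEPENDING ON expression."""
    form = None  # assumed final ELSE branch: NIL (fall through)
    for index in range(len(labels), 0, -1):
        form = ("IF",
                ("EQ$", expression, index),
                ("GO", labels[index - 1]),
                form)
    return form

print(go_depending(["n1", "n2"], "expression"))
```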

3.  Interactive  Facilities 

We have developed a variety of interactive facilities
to support the construction of CTGs and the subsequent
parsing of COBOL programs. The system we describe is
written in the LISP programming language and runs under
INTERLISP (q.v. INTERLISP Reference Manual by Warren
Teitelman, Xerox Palo Alto Research Center).

The two basic functions used to create a CTG are
PUTRULES and PUTTRANS. They are both variadic functions
whose first argument is a nonterminal of the grammar and
whose subsequent arguments are, respectively, the production
rules and transductions for the nonterminal. When a
nonterminal is initially used as first argument to PUTRULES
or PUTTRANS, it is appended to the list NONTERMS of
nonterminals thus far in the grammar. To distinguish this
case, PUTRULES returns the nonterminal as its result;
otherwise it returns NIL. The first nonterminal introduced
becomes the root symbol of the grammar (e.g., the
nonterminal 'proceduredivision' in the CTG of Appendix A).
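
The bookkeeping just described can be modeled in a few lines of Python. This is a sketch of the described behavior, not the INTERLISP code: the class `Grammar` and its method names are invented, the production and transduction values are placeholder strings, and the return value of `put_trans` is left as `None` because the text specifies a result only for PUTRULES.

```python
class Grammar:
    """Toy model of the NONTERMS list and the PUTRULES/PUTTRANS bookkeeping."""

    def __init__(self):
        self.nonterms = []   # nonterminals in order of first use
        self.rules = {}      # nonterminal -> list of productions
        self.trans = {}      # nonterminal -> list of transductions

    def _register(self, nonterminal):
        """Record a nonterminal on first use; report whether it was new."""
        if nonterminal in self.nonterms:
            return False
        self.nonterms.append(nonterminal)
        self.rules[nonterminal] = []
        self.trans[nonterminal] = []
        return True

    def put_rules(self, nonterminal, *productions):
        """Add productions; return the nonterminal on first use, else None."""
        is_new = self._register(nonterminal)
        self.rules[nonterminal].extend(productions)
        return nonterminal if is_new else None

    def put_trans(self, nonterminal, *transductions):
        """Add transductions (return value unspecified in the text)."""
        self._register(nonterminal)
        self.trans[nonterminal].extend(transductions)

    @property
    def root(self):
        """The first nonterminal introduced is the root symbol."""
        return self.nonterms[0] if self.nonterms else None

grammar = Grammar()
first = grammar.put_rules("proceduredivision", "placeholder production")
again = grammar.put_rules("proceduredivision", "another placeholder")
grammar.put_trans("sentence1", "placeholder transduction")
print(first, again, grammar.nonterms, grammar.root)
```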

Once a nonterminal and some corresponding <production,
transduction> pairs have been specified in this way,
adjustments to the grammar may be made by using PUTRULES and
PUTTRANS to add additional alternatives for the nonterminal
(or for other nonterminals), and by using the INTERLISP
editing facilities to modify the alternatives then in
effect. In particular, EDITV(NONTERMS) will allow the user
to modify the list of nonterminals and EDITP(nt) will allow
the user to modify the productions and transductions of a
particular nonterminal.

The grammar (or any part of it) may be listed in a
readable format (as in Appendices A and B) by calling
PRINTGRAMMAR with any subset of NONTERMS as argument.
Appendix A contains such a listing for the subset of the
COBOL procedure division we have selected. Each nonterminal
is printed along with a list of <production, transduction>
pairs, one for each alternative. We have adopted the
convention that lower case symbols denote nonterminals while
upper case symbols and delimiters denote terminals. Also,
note that the transductions are printed in the CLISP
conversational dialect of LISP for increased conciseness and
readability. (In this dialect, described in detail in
Chapter 23 of the INTERLISP Reference Manual, angle brackets
('<' and '>') denote the list consisting of the bracketed
elements. Thus <A B <C>> is equivalent to (LIST A B (LIST
C)). However, an exclamation point indicates that the
following element is to be inserted as a segment, e.g., <! A
B ! C> is equivalent to (APPEND A (LIST B) C). Other
notational innovations of CLISP that we use freely are
apostrophe (') to quote the symbol or form that it precedes
and colon and double colon as infix operators. X:I, where I
is an integer, denotes the Ith element of the list X; X::I
denotes the Ith tail of X.)

Once the grammar has been refined to the user's
satisfaction, it may be saved in a symbolic file for
subsequent reference by the function call
SAVEGRAM(filename), which will also sort the nonterminals
into alphabetic order (except for the root symbol which
remains the first element of NONTERMS). Prior to this call,
the user may also wish to sort the alternatives for each
nonterminal into lexicographical order (based on the
productions of the alternatives). This is done by the call
SORTRULES(NONTERMS).
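
The two orderings can be sketched as follows. This is an assumption-laden illustration in Python, not the INTERLISP code of SAVEGRAM or SORTRULES: the function names, the dictionary representation of the grammar, and the use of Python's default string ordering are all choices made for the example.

```python
# Hypothetical representation: grammar maps a nonterminal name to a list of
# (production, transduction) pairs, where a production is a list of symbols.

def sort_nonterminals(nonterms):
    """Alphabetize all nonterminals except the root, which stays first."""
    root, rest = nonterms[0], nonterms[1:]
    return [root] + sorted(rest)


def sort_rules(grammar, nonterms):
    """Sort each nonterminal's alternatives lexicographically by production."""
    for name in nonterms:
        grammar[name].sort(key=lambda alternative: alternative[0])


nonterms = ["proceduredivision", "connector", "at", "sections"]
grammar = {
    "connector": [(["TO"], "T1"), (["BY"], "T1"),
                  (["INTO"], "T1"), (["FROM"], "T1")],
}

ordered = sort_nonterminals(nonterms)
sort_rules(grammar, ["connector"])
```

After these calls the root is still first in ordered, and the alternatives of connector appear in the order BY, FROM, INTO, TO.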

When  the  grammar  is  completed,  the  system  may  be  used 
to  transduce  COBOL  programs  within  the  COBOL  subset  that  has 
been  defined.  There  are  two  functions  available  for  this 
purpose--PURIFY  and  ABSTRACT.  The  first  of  these 
automatically  transforms  the  grammar  to  an  equivalent  one 
that  contains  no  erasing  rules.  This  is  important  because 
the  many  optional  words  in  the  COBOL  language  lead  to 
erasing  nonterminals  in  the  grammar  (e.g.,  'at'  and  'is'  in 
Appendix  A).  However,  our  parser  has  been  designed  to  deal 
only  with  grammars  without  erasing  rules;  this  permits  a 
simpler  and  more  efficient  parser  than  would  otherwise  be 
possible.  Consequently,  a 'purification'  process  is  needed 
to  obtain  a grammar  acceptable  to  the  parser.  The  effect  of 
this  process  on  a CTG  may  be  seen  by  comparing  Appendices  A 
and B. For example, the nonterminal 'sentence1' has nine
alternatives  in  the  original  grammar  but  requires 
twenty-three  in  the  purified  grammar  to  make  up  for  the 
absence  of  erasing  rules.  A purified  grammar  may  be  saved 
with  SAVEGRAM  as  described  in  the  previous  paragraph. 
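
The effect of purification on a single alternative can be sketched in Python. This is a simplified illustration under stated assumptions, not the PURIFY function itself: transductions are reduced to lists of component indices (so [1, 2, 4, 5] stands for <T1 T2 T4 T5>), an omitted component is replaced by None (standing for NIL), and the function names are hypothetical.

```python
from itertools import product


def nullable_set(grammar):
    """Nonterminals that can derive the empty string (have an erasing rule,
    directly or through other nullable nonterminals)."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for name, alternatives in grammar.items():
            if name not in nullable and any(
                all(symbol in nullable for symbol in production)
                for production, _ in alternatives
            ):
                nullable.add(name)
                changed = True
    return nullable


def purify_alternative(production, transduction, nullable):
    """Yield every non-empty variant of one alternative in which each
    nullable symbol is either kept or dropped. Component indices in the
    transduction are renumbered; dropped components become None."""
    choices = [(True, False) if s in nullable else (True,) for s in production]
    for keep in product(*choices):
        kept = [s for s, k in zip(production, keep) if k]
        if not kept:
            continue  # never emit an erasing rule
        renumber, position = {}, 0
        for index, k in enumerate(keep, start=1):
            if k:
                position += 1
                renumber[index] = position
        yield kept, [renumber.get(i) for i in transduction]


grammar = {
    "sentence1": [(["READ", "filename", "record", "readtarget",
                    "endcondition"], [1, 2, 4, 5])],
    "record": [([], []), (["RECORD"], [])],
    "readtarget": [([], []), (["INTO", "identifier"], [2])],
    "endcondition": [([], []), ([";", "at", "END", "sentence"], [4])],
    "at": [([], []), (["AT"], [])],
}

variants = list(purify_alternative(*grammar["sentence1"][0],
                                   nullable_set(grammar)))
```

With three optional components, the single READ alternative expands into eight, which agrees with the eight READ alternatives of 'sentence1' listed in Appendix B. A complete purifier would also delete the erasing alternatives themselves and substitute each erased nonterminal's own transduction value rather than always NIL.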

Once these preliminaries are complete, it is possible
to parse a COBOL program. The user must enter the program
into the LISP environment and then invoke the function
ABSTRACT, providing two arguments: the program and the
function COBOLTOKENFN. The program is then parsed and
transduced, and the resulting value of ABSTRACT is the
translated program in Abstract COBOL. Appendix C contains
an example of this process: part 1 is a simple COBOL program
and part 3 is the abstract form of the program. Part 2,
included here for completeness but usually of no interest to
a user, shows the parse tree which is constructed from the
input program prior to the invocation of the transductions
of the CTG.
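
The step from parse tree to abstract form can be pictured as a bottom-up walk that applies one transduction per node. The Python sketch below is an assumption-based illustration, not the ABSTRACT function: nodes are written as (nonterminal, alternative, children) tuples, each transduction is a function of the list of transduced children, and the identifier and qualification levels of the real grammar are omitted for brevity.

```python
def transduce(node, rules):
    """Apply transductions bottom-up. A node is (name, alternative, children);
    anything else is a terminal token and is returned unchanged."""
    if not isinstance(node, tuple):
        return node
    name, alternative, children = node
    t = [transduce(child, rules) for child in children]  # t[0] is T1, etc.
    return rules[(name, alternative)](t)


def first(t):
    """The transduction (T1)."""
    return t[0]


rules = {
    ("expression", 1): lambda t: [t[1], t[0], t[2]],   # <T2 T1 T3>
    ("expression", 2): first,
    ("expression2", 3): first,
    ("expression3", 2): first,
    ("expression4", 7): first,
    ("expression4", 8): first,
    ("symbol", 3): first,
    ("number", 1): first,
}

# Simplified parse tree for the expression STORE + 1.
tree = (
    "expression", 1, [
        ("expression", 2, [("expression2", 3, [("expression3", 2, [
            ("expression4", 7, [("symbol", 3, ["STORE"])])])])]),
        "+",
        ("expression2", 3, [("expression3", 2, [
            ("expression4", 8, [("number", 1, [1])])])]),
    ],
)

abstract = transduce(tree, rules)
```

The result is the prefix form of the expression, corresponding to (+ STORE 1) in part 3 of Appendix C.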

Finally,  let  us  describe  the  use  of  COBOLTOKENFN  by 
ABSTRACT.  The  reader  will  observe  that  no  rules  are  given 
for three nonterminals of the CTG: 'symbol', 'number', and
'string'.  This  is  because  they  correspond  to  the  basic 
lexical  symbols,  numeric  constants,  and  textual  constants 
permitted  in  COBOL  which  are,  naturally,  much  too  numerous 
to  be  listed  explicitly.  Instead,  as  each  lexical  token  of 
an  input  program  is  read  by  the  parser,  COBOLTOKENFN  is 
invoked  to  check  whether  it  is  a symbol,  number,  or  string. 
If  so,  the  appropriate  rule  alternatives  are  added 
dynamically  to  these  nonterminals  so  that  parsing  may 
proceed  successfully. 
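
The dynamic addition of rule alternatives might look roughly like the following Python sketch. It is not COBOLTOKENFN: the lexical patterns, the reserved-word check, and the names classify_token and token_rule are assumptions introduced for illustration only.

```python
import re

# Assumed lexical shapes; the report does not spell these out.
NUMBER = re.compile(r"[+-]?\d+(\.\d+)?")
SYMBOL = re.compile(r"[A-Za-z0-9]+(-[A-Za-z0-9]+)*")


def classify_token(token):
    """Return 'number', 'string', 'symbol', or None for one lexical token."""
    text = str(token)
    if NUMBER.fullmatch(text):
        return "number"
    if len(text) >= 2 and text[0] == text[-1] == '"':
        return "string"
    if SYMBOL.fullmatch(text) and any(c.isalpha() for c in text):
        return "symbol"
    return None


def token_rule(grammar, reserved, token):
    """Add an alternative for the token to 'symbol', 'number' or 'string'
    if it is not there yet, and return (nonterminal, alternative number)."""
    kind = classify_token(token)
    if kind is None or token in reserved:
        return None
    alternative = ([token], "T1")
    if alternative not in grammar[kind]:
        grammar[kind].append(alternative)
    return kind, grammar[kind].index(alternative) + 1


grammar = {"symbol": [], "number": [], "string": []}
reserved = {"MOVE", "TO", "ZERO"}   # hypothetical reserved-word list
tokens = ["START-HERE", "ACNT-FILE", "MOVE", "STORE", "ACNT-FILE", 1, 99]
results = [token_rule(grammar, reserved, t) for t in tokens]
```

Numbering alternatives in order of first appearance is consistent with the parse tree in Appendix C, where ACNT-FILE is alternative 2 of 'symbol' and 99 is alternative 2 of 'number'.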
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Appendix  A.  COBOL  Transduction  Grammar 


proceduredivision

= PROCEDURE DIVISION . paragraphs
(<'PROCEDUREDIVISION$ <'SECTION$ NIL ! T4>>)

= PROCEDURE DIVISION . sections
(<'PROCEDUREDIVISION$ ! T4>)


argument

= expressions
(T1)

= expressions connector expression
((if T2 NEQ ('BY)
then <! T1 T3> elseif T1::1 then (HELP
"Error in reduction to
argument.")
else <T3 ! T1>))


at

=
(NIL)

= AT
(NIL)


classcondition

= ALPHABETIC
('ISALPHABETIC$)

= NUMERIC
('ISNUMERIC$)


computetarget

= computetarget1
(<T1>)

= identifier , computetarget
(<<'SET$ T1> ! T3>)

= identifier ROUNDED , computetarget
(<<'SETROUNDED$ T1> ! T4>)


computetarget1

= identifier
(<'SET$ T1>)

= identifier ROUNDED
(<'SETROUNDED$ T1>)

condition

= condition OR condition2
(<T2 T1 T3>)

= condition2
(T1)


condition2

= condition2 AND condition3
(<T2 T1 T3>)

= condition3
(T1)


condition3

= NOT condition3
(<T1 T2>)

= condition4
(T1)


condition4

= ( condition )
(T2)

= simplecondition
(T1)


conditionname

= symbol
(T1)


connector

= BY
(T1)

= FROM
(T1)

= INTO
(T1)

= TO
(T1)


corresponding

= CORR
(NIL)

= CORRESPONDING
(NIL)


corrop

= ADD
('ADDCORRESPONDING$)

= SUBTRACT
('SUBTRACTCORRESPONDING$)


dividearguments

= expression BY expression
(<T1 T3>)

= expression INTO expression
(<T3 T1>)


else

= ELSE
(NIL)

= OTHERWISE
(NIL)


elseclause

=
('NEXT)

= semi else NEXT SENTENCE
('NEXT)

= semi else sentence
(T3)


endcondition

=
(NIL)

= ; at END sentence
(T4)


errorcondition

=
(NIL)

= ; on SIZE ERROR sentence
(T5)


expression

= expression + expression2
(<T2 T1 T3>)

= expression2
(T1)


expression2

= expression2 * expression3
(<T2 T1 T3>)

= expression2 / expression3
(<T2 T1 T3>)

= expression3
(T1)


expression3

= expression3 ** expression4
(<T2 T1 T3>)

= expression4
(T1)


expression4

= ( expression )
(T2)

= + expression4
(T2)

= - expression4
(<T1 0 T2>)

= ZERO
(0)

= ZEROES
(0)

= ZEROS
(0)

= identifier
(T1)

= number
(T1)

= string
(T1)


expressions

= expression
(<T1>)

= expression , expressions
(<T1 ! T3>)


filename

= symbol
(T1)


filenames

= filename
(<T1>)

= filename , filenames
(<T1 ! T3>)


identifier

= qualification
((if (NLISTP T1)
then T1 elseif T1:1='OF and T1:3=NIL then T1:2 else
(HELP 'Error% in% reduction% to% identifier.)))

= qualification ( subscripts )
(<'OF (if (NLISTP T1)
then T1 elseif T1:1='OF and T1:3=NIL then T1:2
else
(HELP 'Error% in% reduction% to% identifier.))
T3>)


identifiers

= identifier
(<T1>)

= identifier , identifiers
(<T1 ! T3>)


indexname

= symbol
(T1)


iotype

= INPUT
('OPENINPUT$)

= IO
('OPENBOTH$)

= OUTPUT
('OPENOUTPUT$)


is

=
(NIL)

= IS
(NIL)


mnemonicname

= symbol
(T1)


move

= MOVE
('SET$)

= MOVE corresponding
('MOVECORRESPONDING$)


of

= IN
(NIL)

= OF
(NIL)


on

=
(NIL)

= ON
(NIL)


operator

= ADD
('+)

= DIVIDE
('/)

= MULTIPLY
('*)

= SUBTRACT
('-)


paragraph

= paragraphname . sentences
(<'PARAGRAPH$ T1 ! T3>)


paragraphname

= symbol
(T1)


paragraphs

= paragraph
(<T1>)

= paragraph paragraphs
(<T1 ! T2>)


performbody

= procedurename
(<'DO$ T1 T1>)

= procedurename thru procedurename
(<'DO$ T1 T3>)


performcontrol

= UNTIL condition
(<T2 T1>)

= expression TIMES
(<T2 T1>)

= varying expression FROM expression BY expression UNTIL
condition
(<T1 <T2 T4 T6 T8>>)


performcontrols

=
(NIL)

= performcontrol performcontrols
(<T1 ! T2>)


procedurename

= symbol
(T1)


procedurenames

= procedurename
(<T1>)

= procedurename , procedurenames
(<T1 ! T3>)


qualification

= symbol
(<'OF T1 NIL>)

= symbol of qualification
(<T3:1 T3:2 <! T3:3 T1>>)


readtarget

=
(NIL)

= INTO identifier
(T2)


record

=
(NIL)

= RECORD
(NIL)


recordname

= symbol
(T1)


relationoperator

= NOT relationoperator2
((SELECTQ T2 ((QUOTE EQ$)
'NEQ$)
((QUOTE NEQ$)
'EQ$)
((QUOTE LT$)
'GTQ$)
((QUOTE GTQ$)
'LT$)
((QUOTE LTQ$)
'GT$)
((QUOTE GT$)
'LTQ$)
(HELP '
"Error in reduction of first alternative of
relationoperator.")))

= relationoperator2
(T1)


relationoperator2

= <
('LT$)

= =
('EQ$)

= >
('GT$)

= EQUAL to
('EQ$)

= GREATER than
('GT$)

= LESS than
('LT$)


rounded

=
(NIL)

= ROUNDED
(T1)


section

= sectionname SECTION . paragraphs
(<'SECTION$ T1 ! T4>)


sectionname

= symbol
(T1)


sections

= section
(<T1>)

= section sections
(<T1 ! T2>)


semi

=
(NIL)

= ;
(NIL)


sentence

= sentence1
(T1)

= sentence2
((if T1::1 then <'DO$ ! T1> else T1:1))


sentence1

= ACCEPT identifier source
(<T1 T2 T3>)

= CLOSE filenames
(<T1 ! T3>)

= GO to procedurename
(<T1 T3>)

= IF condition thenclause elseclause
(<T1 T2 T3 T4>)

= PERFORM performbody performcontrols
((if T3 then (for (X R_T2)
in
(REVERSE T3)
do R_ <'PERFORM X:1 R X:2 NIL> finally
(RETURN R))
else <'PERFORM '(ONCE$)
T2 NIL NIL>))

= READ filename record readtarget endcondition
(<T1 T2 T4 T5>)

= STOP RUN
(T1)

= WRITE recordname writesource
(<T1 T2 T3>)

= corrop corresponding identifier connector identifier
rounded
errorcondition
(<T1 T3 T5 T6 T7>)


sentence2

= COMPUTE computetarget = expression errorcondition
((for X in T2 collect <! X T4 T5>))

= DISPLAY identifiers target
((for X in T2 collect <T1 X T3>))

= DIVIDE dividearguments GIVING computetarget1 REMAINDER
identifier
errorcondition
(<<! T4 <'/ ! T2> T7> <'SET$ T6 <NIL T2:1 <'* T4:2 T2:2>>
T7>>)

= GO to procedurenames DEPENDING on expression
((for I to (LENGTH T3)
collect
(<'IF <'EQ$ T6 I> <'GO (CAR (NTH T3 I))
> 'NEXT>)))

= OPEN iotype filenames
((for X in T3 collect <T2 X>))

= move expression TO identifiers
((for X in T4 collect <T1 X T2 NIL>))

= operator arguments GIVING computetarget errorcondition
((for X in T4 collect <! X (for (Y (R_ T2:-1))
in
(REVERSE T2)
::1 do R_ <T1 Y R> finally
(RETURN R))
T5>))

= operator expressions connector computetarget
errorcondition
((for X in T2 join
(for Y in T4 collect <! Y <T1 Y:2 X> T5>)))


sentences

= sentence1 .
(<T1>)

= sentence1 . sentences
(<T1 ! T3>)

= sentence2 .
(T1)

= sentence2 . sentences
(<! T1 ! T3>)


signcondition

= NEGATIVE
('(GT$ 0))

= NOT NEGATIVE
('(LTQ$ 0))

= NOT POSITIVE
('(GTQ$ 0))

= NOT ZERO
('(NEQ$ 0))

= POSITIVE
('(LT$ 0))

= ZERO
('(EQ$ 0))


simplecondition

= conditionname
(T1)

= expression is relationoperator expression
(<T3 T1 T4>)

= expression is signcondition
(<!! T3 T1>)

= identifier is classcondition
(<T3 T1>)


source

= FROM DATE
(T2)

= FROM DAY
(T2)

= FROM TIME
(T2)

= FROM mnemonicname
(T2)


subscripts

= expression
(<T1>)

= expression , subscripts
(<T1 ! T3>)


target

=
(NIL)

= UPON mnemonicname
(T2)


than

=
(NIL)

= THAN
(NIL)


thenclause

= NEXT SENTENCE
('NEXT)

= semi sentence
(T2)


thru

= THROUGH
(NIL)

= THRU
(NIL)


to

=
(NIL)

= TO
(NIL)


varying

= AFTER
('VARYING)

= VARYING
('VARYING)


writesource

=
(NIL)

= FROM identifier
(T2)
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Appendix  B.  COBOL  Non-erasing  Transduction  Grammar 


proceduredivision

= PROCEDURE DIVISION . paragraphs
(<'PROCEDUREDIVISION$ <'SECTION$ NIL ! T4>>)

= PROCEDURE DIVISION . sections
(<'PROCEDUREDIVISION$ ! T4>)


argument

= expressions
(T1)

= expressions connector expression
((if T2 NEQ ('BY)
then <! T1 T3> elseif T1::1 then (HELP
"Error in reduction to
argument.")
else <T3 ! T1>))


at

= AT
(NIL)


classcondition

= ALPHABETIC
('ISALPHABETIC$)

= NUMERIC
('ISNUMERIC$)


computetarget

= computetarget1
(<T1>)

= identifier , computetarget
(<<'SET$ T1> ! T3>)

= identifier ROUNDED , computetarget
(<<'SETROUNDED$ T1> ! T4>)


computetarget1

= identifier
(<'SET$ T1>)

= identifier ROUNDED
(<'SETROUNDED$ T1>)

condition

= condition OR condition2
(<T2 T1 T3>)

= condition2
(T1)


condition2

= condition2 AND condition3
(<T2 T1 T3>)

= condition3
(T1)


condition3

= NOT condition3
(<T1 T2>)

= condition4
(T1)


condition4

= ( condition )
(T2)

= simplecondition
(T1)


conditionname

= symbol
(T1)


connector

= BY
(T1)

= FROM
(T1)

= INTO
(T1)

= TO
(T1)


corresponding

= CORR
(NIL)

= CORRESPONDING
(NIL)


corrop

= ADD
('ADDCORRESPONDING$)

= SUBTRACT
('SUBTRACTCORRESPONDING$)


dividearguments

= expression BY expression
(<T1 T3>)

= expression INTO expression
(<T3 T1>)


else

= ELSE
(NIL)

= OTHERWISE
(NIL)


elseclause

= semi else NEXT SENTENCE
('NEXT)

= semi else sentence
(T3)

= else NEXT SENTENCE
('NEXT)

= else sentence
(T2)


endcondition

= ; at END sentence
(T4)

= ; END sentence
(T3)


errorcondition

= ; on SIZE ERROR sentence
(T5)

= ; SIZE ERROR sentence
(T4)


expression

= expression + expression2
(<T2 T1 T3>)

= expression2
(T1)


expression2

= expression2 * expression3
(<T2 T1 T3>)

= expression2 / expression3
(<T2 T1 T3>)

= expression3
(T1)


expression3

= expression3 ** expression4
(<T2 T1 T3>)

= expression4
(T1)


expression4

= ( expression )
(T2)

= + expression4
(T2)

= - expression4
(<T1 0 T2>)

= ZERO
(0)

= ZEROES
(0)

= ZEROS
(0)

= identifier
(T1)

= number
(T1)

= string
(T1)


expressions

= expression
(<T1>)

= expression , expressions
(<T1 ! T3>)


filename

= symbol
(T1)


filenames

= filename
(<T1>)

= filename , filenames
(<T1 ! T3>)


identifier

= qualification
((if (NLISTP T1)
then T1 elseif T1:1='OF and T1:3=NIL then T1:2 else
(HELP 'Error% in% reduction% to% identifier.)))

= qualification ( subscripts )
(<'OF (if (NLISTP T1)
then T1 elseif T1:1='OF and T1:3=NIL then T1:2
else
(HELP 'Error% in% reduction% to% identifier.))
T3>)


identifiers

= identifier
(<T1>)

= identifier , identifiers
(<T1 ! T3>)


indexname

= symbol
(T1)


iotype

= INPUT
('OPENINPUT$)

= IO
('OPENBOTH$)

= OUTPUT
('OPENOUTPUT$)


is

= IS
(NIL)


mnemonicname

= symbol
(T1)


move

= MOVE
('SET$)

= MOVE corresponding
('MOVECORRESPONDING$)


of

= IN
(NIL)

= OF
(NIL)


on

= ON
(NIL)


operator

= ADD
('+)

= DIVIDE
('/)

= MULTIPLY
('*)

= SUBTRACT
('-)


paragraph

= paragraphname . sentences
(<'PARAGRAPH$ T1 ! T3>)


paragraphname

= symbol
(T1)


paragraphs

= paragraph
(<T1>)

= paragraph paragraphs
(<T1 ! T2>)


performbody

= procedurename
(<'DO$ T1 T1>)

= procedurename thru procedurename
(<'DO$ T1 T3>)


performcontrol

= UNTIL condition
(<T2 T1>)

= expression TIMES
(<T2 T1>)

= varying expression FROM expression BY expression UNTIL
condition
(<T1 <T2 T4 T6 T8>>)


performcontrols

= performcontrol performcontrols
(<T1 ! T2>)

= performcontrol
(<T1>)


procedurename

= symbol
(T1)


procedurenames

= procedurename
(<T1>)

= procedurename , procedurenames
(<T1 ! T3>)


qualification

= symbol
(<'OF T1 NIL>)

= symbol of qualification
(<T3:1 T3:2 <! T3:3 T1>>)


readtarget

= INTO identifier
(T2)


record

= RECORD
(NIL)


recordname

= symbol
(T1)


relationoperator

= NOT relationoperator2
((SELECTQ T2 ((QUOTE EQ$)
'NEQ$)
((QUOTE NEQ$)
'EQ$)
((QUOTE LT$)
'GTQ$)
((QUOTE GTQ$)
'LT$)
((QUOTE LTQ$)
'GT$)
((QUOTE GT$)
'LTQ$)
(HELP '
"Error in reduction of first alternative of
relationoperator.")))

= relationoperator2
(T1)


relationoperator2

= <
('LT$)

= =
('EQ$)

= >
('GT$)

= EQUAL to
('EQ$)

= GREATER than
('GT$)

= LESS than
('LT$)

= EQUAL
('EQ$)

= GREATER
('GT$)

= LESS
('LT$)


rounded

= ROUNDED
(T1)

section

= sectionname SECTION . paragraphs
(<'SECTION$ T1 ! T4>)


sectionname

= symbol
(T1)


sections

= section
(<T1>)

= section sections
(<T1 ! T2>)


semi

= ;
(NIL)


sentence

= sentence1
(T1)

= sentence2
((if T1::1 then <'DO$ ! T1> else T1:1))


sentence1

= ACCEPT identifier source
(<T1 T2 T3>)

= CLOSE filenames
(<T1 ! T3>)

= GO to procedurename
(<T1 T3>)

= IF condition thenclause elseclause
(<T1 T2 T3 T4>)

= PERFORM performbody performcontrols
((if T3 then (for (X R_T2)
in
(REVERSE T3)
do R_ <'PERFORM X:1 R X:2 NIL> finally
(RETURN R))
else <'PERFORM '(ONCE$)
T2 NIL NIL>))

= READ filename record readtarget endcondition
(<T1 T2 T4 T5>)

= STOP RUN
(T1)

= WRITE recordname writesource
(<T1 T2 T3>)

= corrop corresponding identifier connector identifier
rounded
errorcondition
(<T1 T3 T5 T6 T7>)

= GO procedurename
(<T1 T2>)

= IF condition thenclause
(<T1 T2 T3 'NEXT>)

= PERFORM performbody
((if NIL then (for (X R_T2)
in
(REVERSE NIL)
do R_ <'PERFORM X:1 R X:2 NIL> finally
(RETURN R))
else <'PERFORM '(ONCE$)
T2 NIL NIL>))

= READ filename readtarget endcondition
(<T1 T2 T3 T4>)

= READ filename record endcondition
(<T1 T2 NIL T4>)

= READ filename endcondition
(<T1 T2 NIL T3>)

= READ filename record readtarget
(<T1 T2 T4 NIL>)

= READ filename readtarget
(<T1 T2 T3 NIL>)

= READ filename record
(<T1 T2 NIL NIL>)

= READ filename
(<T1 T2 NIL NIL>)

= WRITE recordname
(<T1 T2 NIL>)

= corrop corresponding identifier connector identifier
errorcondition
(<T1 T3 T5 NIL T6>)

= corrop corresponding identifier connector identifier
rounded
(<T1 T3 T5 T6 NIL>)

= corrop corresponding identifier connector identifier
(<T1 T3 T5 NIL NIL>)


sentence2

= COMPUTE computetarget = expression errorcondition
((for X in T2 collect <! X T4 T5>))

= DISPLAY identifiers target
((for X in T2 collect <T1 X T3>))

= DIVIDE dividearguments GIVING computetarget1 REMAINDER
identifier
errorcondition
(<<! T4 <'/ ! T2> T7> <'SET$ T6 <NIL T2:1 <'* T4:2 T2:2>>
T7>>)

= GO to procedurenames DEPENDING on expression
((for I to (LENGTH T3)
collect
(<'IF <'EQ$ T6 I> <'GO (CAR (NTH T3 I))
> 'NEXT>)))

= OPEN iotype filenames
((for X in T3 collect <T2 X>))

= move expression TO identifiers
((for X in T4 collect <T1 X T2 NIL>))

= operator arguments GIVING computetarget errorcondition
((for X in T4 collect <! X (for (Y (R_ T2:-1))
in
(REVERSE T2)
::1 do R_ <T1 Y R> finally
(RETURN R))
T5>))

= operator expressions connector computetarget
errorcondition
((for X in T2 join
(for Y in T4 collect <! Y <T1 Y:2 X> T5>)))

= COMPUTE computetarget = expression
((for X in T2 collect <! X T4 NIL>))

= DISPLAY identifiers
((for X in T2 collect <T1 X NIL>))

= DIVIDE dividearguments GIVING computetarget1 REMAINDER
identifier
(<<! T4 <'/ ! T2> NIL> <'SET$ T6 <NIL T2:1 <'* T4:2
T2:2>> NIL>>)

= GO procedurenames DEPENDING on expression
((for I to (LENGTH T2)
collect
(<'IF <'EQ$ T5 I> <'GO (CAR (NTH T2 I))
> 'NEXT>)))

= GO to procedurenames DEPENDING expression
((for I to (LENGTH T3)
collect
(<'IF <'EQ$ T5 I> <'GO (CAR (NTH T3 I))
> 'NEXT>)))

= GO procedurenames DEPENDING expression
((for I to (LENGTH T2)
collect
(<'IF <'EQ$ T4 I> <'GO (CAR (NTH T2 I))
> 'NEXT>)))

= operator arguments GIVING computetarget
((for X in T4 collect <! X (for (Y (R_ T2:-1))
in
(REVERSE T2)
::1 do R_ <T1 Y R> finally
(RETURN R))
NIL>))

= operator expressions connector computetarget
((for X in T2 join
(for Y in T4 collect <! Y <T1 Y:2 X> NIL>)))


sentences

= sentence1 .
(<T1>)

= sentence1 . sentences
(<T1 ! T3>)

= sentence2 .
(T1)

= sentence2 . sentences
(<! T1 ! T3>)


signcondition

= NEGATIVE
('(GT$ 0))

= NOT NEGATIVE
('(LTQ$ 0))

= NOT POSITIVE
('(GTQ$ 0))

= NOT ZERO
('(NEQ$ 0))

= POSITIVE
('(LT$ 0))

= ZERO
('(EQ$ 0))


simplecondition

= conditionname
(T1)

= expression is relationoperator expression
(<T3 T1 T4>)

= expression is signcondition
(<!! T3 T1>)

= identifier is classcondition
(<T3 T1>)

= expression relationoperator expression
(<T2 T1 T3>)

= expression signcondition
(<!! T2 T1>)

= identifier classcondition
(<T2 T1>)


source

= FROM DATE
(T2)

= FROM DAY
(T2)

= FROM TIME
(T2)

= FROM mnemonicname
(T2)


subscripts

= expression
(<T1>)

= expression , subscripts
(<T1 ! T3>)


target

= UPON mnemonicname
(T2)


than

= THAN
(NIL)


thenclause

= NEXT SENTENCE
('NEXT)

= semi sentence
(T2)

= sentence
(T1)


thru

= THROUGH
(NIL)

= THRU
(NIL)


to

= TO
(NIL)


varying

= AFTER
('VARYING)

= VARYING
('VARYING)


writesource

= FROM identifier
(T2)
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Appendix  C.  A Sample  Transduction 


1. A COBOL Program


PROCEDURE DIVISION .

START-HERE .
    OPEN INPUT ACNT-FILE .
    MOVE ZERO TO STORE .

READ-IT .
    READ ACNT-FILE ; AT END GO TO END-IT .
    ADD 1 TO STORE .

COMPARE .
    IF ACNT-NO EQUAL STORE GO READ-IT .
    DISPLAY STORE .
    IF STORE EQUAL 99 STOP RUN .
    ADD 1 TO STORE .
    GO COMPARE .

END-IT .
    COMPUTE STORE = STORE + 1 .
    IF STORE IS GREATER THAN 99 ; STOP RUN .
    DISPLAY STORE .
    GO TO END-IT .


2.  The  Corresponding  Parse  Tree 

((root . 1)
 ((proceduredivision . 1)
  PROCEDURE DIVISION %.
  ((paragraphs . 2)
   ((paragraph . 1)
    ((paragraphname . 1) ((symbol . 1) START-HERE))
    %.
    ((sentences . 4)
     ((sentence2 . 5)
      OPEN
      ((iotype . 1) INPUT)
      ((filenames . 1)
       ((filename . 1) ((symbol . 2) ACNT-FILE))))
     %.
     ((sentences . 3)
      ((sentence2 . 6)
       ((move . 1) MOVE)
       ((expression . 2)
        ((expression2 . 3)
         ((expression3 . 2) ((expression4 . 4) ZERO))))
       TO
       ((identifiers . 1)
        ((identifier . 1)
         ((qualification . 1) ((symbol . 3) STORE)))))
      %.)))
   ((paragraphs . 2)
    ((paragraph . 1)
     ((paragraphname . 1) ((symbol . 4) READ-IT))
     %.
     ((sentences . 2)
      ((sentence1 . 15)
       READ
       ((filename . 1) ((symbol . 2) ACNT-FILE))
       ((endcondition . 1)
        ;
        ((at . 1) AT)
        END
        ((sentence . 1)
         ((sentence1 . 3)
          GO
          ((to . 1) TO)
          ((procedurename . 1) ((symbol . 5) END-IT))))))
      %.
      ((sentences . 3)
       ((sentence2 . 16)
        ((operator . 1) ADD)
        ((expressions . 1)
         ((expression . 2)
          ((expression2 . 3)
           ((expression3 . 2)
            ((expression4 . 8) ((number . 1) 1))))))
        ((connector . 4) TO)
        ((computetarget . 1)
         ((computetarget1 . 1)
          ((identifier . 1)
           ((qualification . 1) ((symbol . 3) STORE))))))
       %.)))

    ((paragraphs . 2)
     ((paragraph . 1)
      ((paragraphname . 1) ((symbol . 6) COMPARE))
      %.
      ((sentences . 2)
       ((sentence1 . 11)
        IF
        ((condition . 2)
         ((condition2 . 2)
          ((condition3 . 2)
           ((condition4 . 2)
            ((simplecondition . 5)
             ((expression . 2)
              ((expression2 . 3)
               ((expression3 . 2)
                ((expression4 . 7)
                 ((identifier . 1)
                  ((qualification . 1)
                   ((symbol . 7) ACNT-NO)))))))
             ((relationoperator . 2)
              ((relationoperator2 . 7) EQUAL))
             ((expression . 2)
              ((expression2 . 3)
               ((expression3 . 2)
                ((expression4 . 7)
                 ((identifier . 1)
                  ((qualification . 1)
                   ((symbol . 3) STORE))))))))))))
        ((thenclause . 3)
         ((sentence . 1)
          ((sentence1 . 10)
           GO
           ((procedurename . 1) ((symbol . 4) READ-IT))))))
       %.
       ((sentences . 4)
        ((sentence2 . 10)
         DISPLAY
         ((identifiers . 1)
          ((identifier . 1)
           ((qualification . 1) ((symbol . 3) STORE)))))
        %.
        ((sentences . 2)
         ((sentence1 . 11)
          IF
          ((condition . 2)
           ((condition2 . 2)
            ((condition3 . 2)
             ((condition4 . 2)
              ((simplecondition . 5)
               ((expression . 2)
                ((expression2 . 3)
                 ((expression3 . 2)
                  ((expression4 . 7)
                   ((identifier . 1)
                    ((qualification . 1)
                     ((symbol . 3) STORE)))))))
               ((relationoperator . 2)
                ((relationoperator2 . 7) EQUAL))
               ((expression . 2)
                ((expression2 . 3)
                 ((expression3 . 2)
                  ((expression4 . 8)
                   ((number . 2) 99))))))))))
          ((thenclause . 3)
           ((sentence . 1) ((sentence1 . 7) STOP RUN))))
         %.
         ((sentences . 4)
          ((sentence2 . 16)
           ((operator . 1) ADD)
           ((expressions . 1)
            ((expression . 2)
             ((expression2 . 3)
              ((expression3 . 2)
               ((expression4 . 8) ((number . 1) 1))))))
           ((connector . 4) TO)
           ((computetarget . 1)
            ((computetarget1 . 1)
             ((identifier . 1)
              ((qualification . 1) ((symbol . 3) STORE))))))
          %.
          ((sentences . 1)
           ((sentence1 . 10)
            GO
            ((procedurename . 1) ((symbol . 6) COMPARE)))
           %.))))))

     ((paragraphs . 1)
      ((paragraph . 1)
       ((paragraphname . 1) ((symbol . 5) END-IT))
       %.
       ((sentences . 4)
        ((sentence2 . 9)
         COMPUTE
         ((computetarget . 1)
          ((computetarget1 . 1)
           ((identifier . 1)
            ((qualification . 1) ((symbol . 3) STORE)))))
         =
         ((expression . 1)
          ((expression . 2)
           ((expression2 . 3)
            ((expression3 . 2)
             ((expression4 . 7)
              ((identifier . 1)
               ((qualification . 1) ((symbol . 3)
                                     STORE)))))))
          +
          ((expression2 . 3)
           ((expression3 . 2)
            ((expression4 . 8) ((number . 1) 1))))))
        %.
        ((sentences . 2)
         ((sentence1 . 11)
          IF
          ((condition . 2)
           ((condition2 . 2)
            ((condition3 . 2)
             ((condition4 . 2)
              ((simplecondition . 2)
               ((expression . 2)
                ((expression2 . 3)
                 ((expression3 . 2)
                  ((expression4 . 7)
                   ((identifier . 1)
                    ((qualification . 1)
                     ((symbol . 3) STORE)))))))
               ((is . 1) IS)
               ((relationoperator . 2)
                ((relationoperator2 . 5)
                 GREATER
                 ((than . 1) THAN)))
               ((expression . 2)
                ((expression2 . 3)
                 ((expression3 . 2)
                  ((expression4 . 8)
                   ((number . 2) 99))))))))))
          ((thenclause . 2)
           ((semi . 1) ;)
           ((sentence . 1) ((sentence1 . 7) STOP RUN))))
         %.
         ((sentences . 4)
          ((sentence2 . 10)
           DISPLAY
           ((identifiers . 1)
            ((identifier . 1)
             ((qualification . 1) ((symbol . 3) STORE)))))
          %.
          ((sentences . 1)
           ((sentence1 . 3)
            GO
            ((to . 1) TO)
            ((procedurename . 1) ((symbol . 5) END-IT)))
           %.))))))))))
 RPAD)


3. The Corresponding Abstract Form


(PROCEDUREDIVISION$
  (SECTION$ NIL
    (PARAGRAPH$ START-HERE
      (OPENINPUT$ ACNT-FILE)
      (SET$ STORE 0 NIL))
    (PARAGRAPH$ READ-IT
      (READ ACNT-FILE NIL (GO END-IT))
      (SET$ STORE (+ STORE 1) NIL))
    (PARAGRAPH$ COMPARE
      (IF (EQ$ ACNT-NO STORE) (GO READ-IT)
          NEXT)
      (DISPLAY STORE NIL)
      (IF (EQ$ STORE 99) STOP NEXT)
      (SET$ STORE (+ STORE 1) NIL)
      (GO COMPARE))
    (PARAGRAPH$ END-IT
      (SET$ STORE (+ STORE 1) NIL)
      (IF (GT$ STORE 99) STOP NEXT)
      (DISPLAY STORE NIL)
      (GO END-IT))))


APPENDIX  II 


Syntax  of  the  COBOL  DATA  DIVISION 
L. Robinson

This document contains the syntax of the DATA DIVISION
of the COBOL subset for verification. As is the case for
the PROCEDURE DIVISION, the language is described as a
transduction grammar. At present, the transductions for
the DATA DIVISION grammar have not been included. The
objective of the transductions of the PROCEDURE DIVISION
is to create a COBOL program in abstract form. The
transductions of the DATA DIVISION can be used to
construct a symbol table to be employed by the COBOL
verification system.

The DATA DIVISION is divided into two parts, the FILE
SECTION and the WORKING-STORAGE SECTION. The FILE SECTION
contains the information on files used by the program, and a
description of the data records associated with each file. A
data  record  contains  the  names  and  picture  specifications 
(i.e.,  the  declarations)  of  variables  used  in  the  program. 
The  WORKING-STORAGE  SECTION  is  used  to  declare  the  program 
variables  not  contained  in  data  records  of  files.  Variables 
in  the  WORKING-STORAGE  SECTION  may  be  declared  individually 
or  grouped  into  data  records. 
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A file  declaration  contains  several  options,  of  which 
only a few are included in the subset. The LABEL and DATA
RECORD options are included, while BLOCK, RECORD, VALUE OF,
LINAGE, CODE-SET, and REPORT are eliminated. BLOCK, RECORD,
VALUE OF, and CODE-SET are items of value to the
implementing machine. LINAGE and REPORT are used by the
report module of COBOL, none of whose primitives are part of
the subset.


Record descriptions are tree-structured. A record
description entry can designate a group item, in which case
it contains a level number and a name, or an elementary
item, in which case its picture, etc., are also described.
If the value of an elementary item corresponds to a
condition-name, a level number of 88 is used together with a
description of the values that signify the condition. A
working-storage variable declared individually has a level
number of 77. The first name in a data record description
must have a level number of 01. For elementary or group
items, any two-digit level number (except 01, 66, 77, or 88)
may be used.
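
The way level numbers induce a tree can be sketched in Python. This is an illustrative assumption about how a symbol-table builder might group entries, not part of the grammar given in this appendix; the function name, the dictionary layout, and the names ACNT-REC, STATUS-CODE and CLOSED are hypothetical.

```python
def build_record_tree(entries):
    """entries: (level, name) pairs in source order, assumed well formed.
    Returns the top-level nodes; each node lists its subordinate items."""
    roots = []
    stack = []  # currently open items, outermost first
    for level, name in entries:
        node = {"level": level, "name": name, "children": []}
        if level == 88:
            # A condition-name belongs to the item declared just before it.
            stack[-1]["children"].append(node)
            continue
        if level in (1, 77):
            stack = []  # 01 starts a record; 77 stands alone
        else:
            # Close every open item that is not a proper ancestor.
            while stack and stack[-1]["level"] >= level:
                stack.pop()
        if stack:
            stack[-1]["children"].append(node)
        else:
            roots.append(node)
        stack.append(node)
    return roots


entries = [
    (1, "ACNT-REC"),
    (2, "ACNT-NO"),
    (2, "STATUS-CODE"),
    (88, "CLOSED"),
    (77, "STORE"),
]
tree = build_record_tree(entries)
```

In this example ACNT-REC becomes a group item with two elementary items, CLOSED is attached to STATUS-CODE as its condition-name, and STORE is a separate level-77 variable.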


A data description entry characterizes an elementary
data item. It consists of a level number followed by the
name of the item or FILLER (if the item is not to be
referenced at the elementary level by the program) and a
list of options. We include the options PICTURE, JUSTIFIED,
and VALUE. Excluded are REDEFINES, USAGE, SIGN, OCCURS,
SYNCHRONIZED, and BLANK ZERO. REDEFINES and OCCURS have not
been axiomatized, and would require an enlargement of the
subset. USAGE will only be DISPLAY for this subset, so it
was eliminated (since that is the default). SIGN and BLANK
ZERO can be handled by numeric editing. JUSTIFIED provides
for the alignment of characters in an alphanumeric item when
data items are moved to it. VALUE performs initialization
of an elementary data item.

PICTURE specifications are the most complicated (and
perhaps the most interesting) part of the DATA DIVISION
grammar. There are three types of pictures, with the picture
type determining the type of the data item. Alphabetic items
may contain letters and spaces. Alphanumeric items may
contain any printable characters. Numeric items contain
fixed decimal or integer values. The picture specification
may indicate that the data item is edited, in which case
changes are made in order to print out the data item.
Editing can take two forms: insertion and zero suppression.
In insertion, extra characters are inserted between digits
in the edited item. The nature of the insertion may depend
on the value of the item. In zero suppression, leading zeros
(and intervening insertion characters) to the left of the
decimal point are replaced by spaces, asterisks, or spaces
followed by either plus, minus, or currency sign. In PICTURE
specifications, the kind of editing is described by the
sequence of characters involved. Since there are many
possible character combinations (corresponding to the kinds
of editing to be done), the grammar for the picture
specifications is difficult indeed.
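
The two forms of editing described above can be illustrated with a small Python sketch. It is not part of the report: the function name `edit_zero_suppress` is hypothetical, and the model is deliberately narrow (non-negative values, pictures built only from Z, 9, comma, and a decimal point, with 9s only after the point). It shows leading zeros and an intervening comma being replaced by spaces.

```python
from decimal import Decimal, ROUND_DOWN


def edit_zero_suppress(value, picture):
    """Edit a non-negative value under a picture made of Z, 9, ',' and '.'.

    Simplified model (an assumption of this sketch): the picture has at
    least one digit position to the left of the decimal point, and only
    9s to the right of it.
    """
    int_pic, _, frac_pic = picture.partition(".")
    int_digits = sum(ch in "Z9" for ch in int_pic)
    frac_digits = len(frac_pic)

    # Align on the decimal point and drop excess low-order digits.
    quantum = Decimal(1).scaleb(-frac_digits)
    scaled = Decimal(str(value)).quantize(quantum, rounding=ROUND_DOWN)
    int_text, _, frac_text = f"{scaled:f}".partition(".")
    digits = iter(int_text[-int_digits:].rjust(int_digits, "0"))

    out = []
    suppressing = True  # still inside the run of leading zeros
    for ch in int_pic:
        if ch == ",":
            # An insertion character inside the suppressed run becomes a space.
            out.append(" " if suppressing else ",")
            continue
        digit = next(digits)
        if ch == "Z" and suppressing and digit == "0":
            out.append(" ")
        else:
            suppressing = False
            out.append(digit)
    edited = "".join(out)
    return edited + "." + frac_text if frac_pic else edited


print(repr(edit_zero_suppress("1234.5", "ZZ,ZZ9.99")))
print(repr(edit_zero_suppress(7, "ZZ,ZZ9.99")))
```

Asterisk replacement, floating signs, and currency symbols would each need further cases, which is the combinatorial growth the paragraph above refers to.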


Appendix A. Grammar for the Data Division


data-division
= DATA DIVISION
= DATA DIVISION file-section
= DATA DIVISION file-section working-storage-section
= DATA DIVISION working-storage-section


$character
= $
= embeddedcharacter


$string
= $character $string
= $string


$string1
= $string
= $string decimalpoint
= $string decimalpoint $string
= $string pstring impliedpoint
= decimalpoint $string
= impliedpoint pstring $string


*character
= *
= embeddedcharacter


*string
= *character
= *character *string


9character
= 9
= embeddedcharacter


9string
= 9character
= 9character 9string


creditdebit
= CR
= DB


currencyinsertion
= $string editstring
= $string1


data-description
= 88 symbol semi value ranges .
= level-number data-name .
= level-number data-name picture justification value-clause


data-description1
= symbol picture justification value-clause


data-descriptions
= data-description .
= data-description . data-descriptions


data-name
= FILLER
= symbol


data-record
= semi DATA record symbols

decimalpoint


Syntax of the COBOL DATA DIVISION


editstring
= editstring2
= pstring
= pstring impliedpoint


editstring1
= editstring2
= impliedpoint pstring 9string
= pstring 9string


editstring2
= 9string
= 9string decimalpoint
= 9string decimalpoint 9string
= 9string pstring
= 9string pstring impliedpoint
= decimalpoint 9string


embeddedcharacter
= 0
= ,
= /
= B


file-descriptor
= FD symbol label-record .
= FD symbol label-record data-record . record-description


file-descriptors
= file-descriptor
= file-descriptor file-descriptors


file-section
= FILE SECTION .
= FILE SECTION . file-descriptors


impliedpoint
= V


initialpart
= $
= $ +
= $ -
= + $
= - $


is
= IS


just
= JUST
= JUSTIFIED


justification
= semi just
= semi just RIGHT


label-record
= semi LABEL record OMITTED
= semi LABEL record STANDARD


literal
= number
= string


minuscharacter
= -
= embeddedcharacter


minusstring
= minuscharacter
= minuscharacter minusstring


numericedited
= currencyinsertion
= currencyinsertion creditdebit
= initialpart numericedited1
= numericedited1
= numericedited1 creditdebit
= sign currencyinsertion
= signinsertion


numericedited1
= editstring1
= zerosuppression


picture
= semi picture-word is picture-spec


picture-word
= PIC
= PICTURE


pluscharacter
= +
= embeddedcharacter


plusstring
= pluscharacter
= pluscharacter plusstring


pstring
= P
= P pstring


range
= literal
= literal thru literal


ranges
= range
= range , ranges


record
= RECORD
= RECORD IS
= RECORDS
= RECORDS ARE


record-description
= 01 symbol . data-descriptions


record-descriptions
= record-description .
= record-description . record-descriptions


semi


sign
= +


signinsertion
= $ signstring editstring
= signstring editstring
= signstring1


signstring
= minusstring
= plusstring


signstring1
= decimalpoint signstring
= impliedpoint pstring signstring
= minusstring decimalpoint minusstring
= plusstring decimalpoint plusstring
= signstring
= signstring decimalpoint
= signstring pstring impliedpoint


suppressstring
= *string
= zstring


suppressstring1
= *string decimalpoint *string
= decimalpoint suppressstring
= impliedpoint pstring suppressstring
= suppressstring
= suppressstring decimalpoint
= suppressstring pstring impliedpoint
= zstring decimalpoint zstring


symbols
= symbol
= symbol symbols


thru
= THROUGH
= THRU


value
= VALUE
= VALUE IS
= VALUES
= VALUES ARE


value-clause
= semi VALUE IS literal
= semi VALUE literal


working-storage-list
= 77 data-description1 .
= 77 data-description1 . working-storage-list
= record-description .
= record-description . working-storage-list


working-storage-section
= WORKING-STORAGE SECTION .
= WORKING-STORAGE SECTION . working-storage-list


zcharacter
= Z
= embeddedcharacter


zerosuppression
= suppressstring editstring
= suppressstring1


zstring
= zcharacter
= zcharacter zstring

APPENDIX  III 


M.  W.  Green 

Axiomatization of COBOL Semantics

This section reports on preliminary work toward an axiomatic
representation of COBOL semantics. The aim is to describe an adequate, but
somewhat restricted, subset of COBOL in such a way that automatic or
semi-automatic generation of program verification conditions is facilitated.

To a considerable extent we have been guided by Hoare's axiomatization of
the language PASCAL [1]. However, COBOL is in some respects a much more
complex language than PASCAL, so that some additional notational and
metalinguistic conveniences had to be improvised to describe the effect of
certain COBOL statements.

The COBOL language is described in the ANSI Report [2] by a collection
of syntactic forms accompanied by informal or "prose" specifications of the
effect of each language statement. In interpreting this document we have
noticed several instances where a restriction or a relaxation of the allowed
language expressions would be helpful in formulating useful program
verification conditions. Where these situations arise, we have arbitrarily
chosen to use the most convenient interpretation (or restriction). In
particular, we should mention:

1.  There are several instances where the description of COBOL
    syntax (as given in the ANSI report) seems unnecessarily
    restrictive. For example, the GOTO ... DEPENDING ON [id]
    statement could just as well accept an integer-valued arithmetic
    expression (or even a COMPUTE ...) instead of a simple
    identifier. Where we could see no reason for observing this
    sort of language restriction we have omitted it from the
    axiomatization of the version of COBOL that actually will be
    used in program proving. If a compiler forbids the more
    relaxed syntax, then the program is not well formed and proof
    of correctness is not an issue.

2.  There are instances where a COBOL statement is considered
    to be too dangerous or too difficult to cope with in program
    verification. Examples are the ALTER verb and the MOVE
    statement applied to group data-items. In the former case,
    the possibly intricate variations in flow of program control
    are very difficult to handle. In the latter case, the
    predicates associated with all of the possible consequences of
    an unformatted transfer of alphanumeric data are extremely
    complex. For such reasons we will often omit some COBOL
    statements from the axiomatic description (forbid them) or
    restrict the generality of others.

3.  Where the syntactic correctness of an allowed COBOL statement
    is clearly checkable by a compiler, we will assume correctness
    on the part of the compiler. Furthermore, we assume that certain
    run-time checks that detect operands of inappropriate type will
    be compiled into the executable code. This means, for example,
    that we need not adjoin predicates to an ADD statement which
    assert that the arguments are numeric quantities. The consequences
    of a run-time error in data-type may be handled in at least two
    sensible ways. The first would attach an implicit ON ERROR GOTO ...
    to each statement where such a situation could occur. We choose a
    simpler alternative in which these errors signify non-termination
    of the program. This is consistent with Hoare's treatment of PASCAL,
    wherein P{S}Q is satisfied if S diverges.

4.  COBOL is a language with a rather weak notion of data-type, and a
    very elaborate collection of conversion rules. Other algebraic
    languages make do with a few standard internal representations for
    data-types and a few permissible high-level coercion rules such as
    INTEGER + REAL = REAL to preserve integrity of data-type. In these
    languages the explicit rounding or truncation of numeric quantities
    to conform with non-standard internal representation must usually
    be accomplished by extra arithmetic manipulations expressed as
    high-level language statements. In the latter respect COBOL is peculiar
    (but not unique), because a simple assignment statement need not
    preserve numeric equality between the sending and receiving
    values. For this reason we need some method of expressing
    the effect of transmitting the value of an elementary COBOL
    data-item to a receiving identifier. The route we have taken
    is to express these conversion rules as functions (without
    side-effects) that accept values and PICTUREs as arguments and
    return values equivalent to the COBOL conversion rules. (See
    details below.)

Illustrative  Examples 

1. MOVE

The COBOL MOVE statement is the analog of the assignment statement
in other high-level languages. For the most primitive form of this statement,
MOVE x TO y, the corresponding PASCAL or ALGOL equivalent is y := x. The Hoare
axiomatization of this statement would be

$$P^{y}_{x}\ \{\text{MOVE } x \text{ TO } y\}\ P$$

where the notation $P^{y}_{x}$ denotes the predicate derived from P by
substitution of the value of x for y in P. Informally, if P is true after
execution of {MOVE x TO y}, then $P^{y}_{x}$ was true beforehand. Now the MOVE
statement, in addition to having several variational forms, may also modify
data so that $x \ne y$ after a MOVE. This occurs whenever x suffers an editing
operation on being transferred to location y. All such editing operations may
be described by functions having no side-effects, such as
$\mathrm{Edit}(x, \mathit{pic}_y)$. Here, Edit is a function of two arguments,
the value of x and the PICTURE corresponding to the variable y. The
internal details of the function Edit implement the conversion rules described
by the COBOL report, and the function definition of Edit can serve to define the
semantics of the conversion process.
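
As a small worked instance of the substitution notation (the postcondition is chosen here purely for illustration and does not appear in the report), take P to be the assertion that y does not exceed 99. Substituting the sending value for y gives the assertion that must hold before the MOVE. When the transfer involves editing, it is the edited value that is substituted:

```latex
P \;\equiv\; (y \le 99), \qquad
P^{y}_{x} \;\equiv\; (x \le 99), \qquad
P^{y}_{E} \;\equiv\; (E \le 99)
\quad\text{with } E = \mathrm{Edit}(x, \mathit{pic}_y).
```

The third form makes explicit that, once editing is taken into account, the assertion is about the value actually stored in y rather than about x itself.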

In statements that manipulate arithmetic quantities, COBOL provides
the option of truncating or rounding values that might not be accommodated to
full precision in the receiving picture. Truncation is the normal default
operation, but rounding can be forced by the use of the ROUNDED modifier in
most arithmetic operations. The effects of truncation or rounding may also
be described by editing functions, for example,

$$\mathrm{Edittrunc}(x, \mathit{pic}_y) \quad\text{and}\quad \mathrm{Editround}(x, \mathit{pic}_y)$$

with equally precise internally defined semantics. In the following discussion
we will use the function Edit as a generic name for the conversion function.
When a ROUNDED modifier appears in a COBOL statement it should be understood
that Editround is to be used instead of Edit, etc.
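
A minimal Python sketch of two such side-effect-free editing functions follows. It is illustrative only. The names `edit_trunc` and `edit_round` are hypothetical stand-ins for Edittrunc and Editround, the receiving picture is reduced to its number of fractional digit positions, and rounding is assumed to take ties away from zero. High-order overflow is not modeled.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def _fit(value, frac_digits, rounding):
    """Return value adjusted to frac_digits decimal places (pure function)."""
    quantum = Decimal(1).scaleb(-frac_digits)
    return Decimal(str(value)).quantize(quantum, rounding=rounding)


def edit_trunc(value, frac_digits):
    """Drop excess low-order digits (the default behaviour in the text)."""
    return _fit(value, frac_digits, ROUND_DOWN)


def edit_round(value, frac_digits):
    """Round excess low-order digits (assumed: ties go away from zero)."""
    return _fit(value, frac_digits, ROUND_HALF_UP)


x = Decimal("12.345")
print(edit_trunc(x, 2), edit_round(x, 2))
# The received value need not equal the sending value:
print(edit_trunc(x, 2) == x)
```

Because both functions return a new value and touch no state, they can appear inside assertions in the same way that Edit appears in the axioms.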

Standard COBOL permits a MOVE of non-elementary (group) items via
an unformatted block-transfer of alphanumeric information. To avoid this
dangerous programming practice, we make a restriction that MOVE x TO y, where
x and y are group items, is permitted only when x and y have identical
picture-structure as defined in the DATA DIVISION. With this constraint all MOVE
statements can be decomposed into MOVEs of elementary items.

Observing these conventions, the axiom for the MOVE of an elementary
data-item becomes

$$P^{y}_{E}\ \{\text{MOVE } x \text{ TO } y\}\ P, \quad\text{where } E = \mathrm{Edit}(x, \mathit{pic}_y).$$

The alternative form, MOVE x TO a, b, ... is transduced by the CTG (see Appendix I)
to a sequence of simple MOVEs and therefore does not require separate
axiomatization. However, to explicate this notion, we introduce the rule

$$P^{a,b,\ldots}_{E_1,E_2,\ldots}\ \{\text{MOVE } x \text{ TO } a, b, \ldots\}\ P, \quad E_1 = \mathrm{Edit}(x, \mathit{pic}_a),\ E_2 = \mathrm{Edit}(x, \mathit{pic}_b),\ \ldots$$

Now, according to Section 5.15.1 of the COBOL standard, value
substitutions are to be carried out in sequence rather than "simultaneously."
Thus, the statements

    MOVE i TO j, a(j)

    MOVE i TO a(j), j

should have different effects. Consequently, we will observe the convention
that the notation

$$P^{a,b,\ldots}_{E_1,E_2,\ldots}$$

stands for the expression resulting from first substituting $E_1$ for a in P,
then substituting $E_2$ for b in the derived expression, etc. This differs from
the interpretation found in Hoare [1], where simultaneous substitution was the rule.
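
The difference between the two statements can be seen operationally in a short Python sketch. This is an illustration under stated assumptions, not the report's formalism: the table `a` is modeled as a dictionary with 1-based keys, editing is ignored, and the function names are hypothetical.

```python
def move_i_to_j_then_a_j(i, j, a):
    """Model of MOVE i TO j, a(j): j is updated before a(j) is located."""
    a = dict(a)
    j = i       # first receiving item
    a[j] = i    # subscript uses the NEW value of j
    return j, a


def move_i_to_a_j_then_j(i, j, a):
    """Model of MOVE i TO a(j), j: a(j) is located with the old j."""
    a = dict(a)
    a[j] = i    # subscript uses the OLD value of j
    j = i       # second receiving item
    return j, a


table = {1: 0, 2: 0, 3: 0}
print(move_i_to_j_then_a_j(2, 1, table))  # element 2 receives the value
print(move_i_to_a_j_then_j(2, 1, table))  # element 1 receives the value
```

Both calls leave j equal to 2, but a different table element is changed, which is why a simultaneous-substitution reading would not describe the statement correctly.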
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The  third  variant  of  the  MOVE  statement,  namely  MOVE  CORRESPONDING 
x TO y, involves group data-items. In the internal representation of the
data division of a COBOL program there will be a form of symbol table that
provides a unique address (name) for each elementary data-item. However this
information is kept, an equivalent unique name for each elementary item can
be specified by forming the ordered list of identifiers (i.e., qualifiers)
proceeding from the name of the data item upward through each level of data
subdivision to the 01 level. For example, in the data structure

    01  RECORD
        02  BAZ
            05  A
            05  B

the namelist of B is (B, BAZ, RECORD), and this is entirely equivalent to
the specification in COBOL syntax, B IN BAZ IN RECORD.

Definition:

Two elementary items are CORRESPONDING with respect to $id_1$ and $id_2$
if $id_1 \ne id_2$ and the namelist of the first item up to but not including
$id_1$ is identical with the namelist of the second item up to but not
including $id_2$. Let Z be the set of ordered pairs $(x_i, y_i)$ of elementary
items CORRESPONDING with respect to x and y, and let
$E_i = \mathrm{Edit}(x_i, \mathit{pic}_{y_i})$, ...; then the rule

$$\frac{\forall (x_i, y_i) \in Z:\ P^{y_i}_{E_i}\ \{\text{MOVE } x_i \text{ TO } y_i\}\ P}
       {P^{y_1,\ldots}_{E_1,\ldots}\ \{\text{MOVE CORRESPONDING } x \text{ TO } y\}\ P}$$

gives the semantic interpretation of this form of the MOVE statement.
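
The definition of CORRESPONDING items can be sketched directly in terms of namelists. The Python below is an illustration only: namelists are modeled as tuples running from the item name up to the 01 level, the helper names are hypothetical, and the further exclusions that standard COBOL applies (for example FILLER items) are not modeled.

```python
def prefix_up_to(namelist, group):
    """Qualifiers from the item name upward, stopping before `group`.

    Returns None when `group` does not occur in the namelist.
    """
    if group not in namelist:
        return None
    return namelist[:namelist.index(group)]


def corresponding_pairs(items, id1, id2):
    """Build the set Z of (first, second) namelist pairs for id1 and id2."""
    pairs = []
    if id1 == id2:
        return pairs
    for first in items:
        head = prefix_up_to(first, id1)
        if not head:  # item is not subordinate to id1
            continue
        for second in items:
            if prefix_up_to(second, id2) == head:
                pairs.append((first, second))
    return pairs


items = [
    ("A", "BAZ", "REC-1"),
    ("B", "BAZ", "REC-1"),
    ("A", "BAZ", "REC-2"),
    ("C", "BAZ", "REC-2"),
]
print(corresponding_pairs(items, "REC-1", "REC-2"))
```

Only the two items named A pair up here, so a MOVE CORRESPONDING between the two records would decompose into a single elementary MOVE.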
2. GOTO

The GOTO statement has two variants:

    GOTO procedure-name

    GOTO procedure-name-1, ..., procedure-name-n DEPENDING ON id

A suitable axiomatic treatment of the GOTO is given in Knuth [3]. For
COBOL programs, we need the following rules. To each procedure-name L in
the program that is the target or possible target of a GOTO we must provide
a logical assertion predicate P(L) that must be true whenever flow of
program control reaches L. Then

$$P(L)\ \{\text{GOTO } L\}\ \mathit{false}$$

and the rule of inference

$$\frac{P(L)\ \{\mathit{body}\}\ Q}{P(L)\ \{L\colon \mathit{body}\}\ Q}$$

gives the appropriate condition that the GOTO must satisfy. Here body
represents the statements belonging to a procedure-name.

For the second version of the GOTO statement, which resembles the
ALGOL switch construct, the informal semantics are that the identifier is
evaluated to an integer i and control is transferred to the i-th procedure
in the procedure-name list. If i is not an integer in the range 1 to n
(n is the number of procedures in the name list) then the statement has no
effect, i.e., control "falls through" to the next COBOL statement.

We handle this construct by developing the DEPENDING ON conditional
into a set of equivalent IF statements during the transduction phase so that
no separate axiomatization is required.
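
The expansion described above might look like the following Python sketch. The exact text emitted by the transducer is not given in the report, so the output format and the function name `expand_goto_depending` are assumptions made for illustration.

```python
def expand_goto_depending(procedure_names, identifier):
    """Rewrite GOTO p1, ..., pn DEPENDING ON id as one IF per target.

    If the identifier matches none of 1..n, no IF fires and control
    falls through to the next statement, as the informal semantics require.
    """
    return [
        f"IF {identifier} = {position} GOTO {name}."
        for position, name in enumerate(procedure_names, start=1)
    ]


for line in expand_goto_depending(["READ-PARA", "WRITE-PARA", "CLOSE-PARA"], "K"):
    print(line)
```

Each generated statement is then covered by the IF rule and the simple GOTO axiom, which is why the DEPENDING ON form needs no rule of its own.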

3. IF-ELSE

The syntactic form of this COBOL statement is

    IF condition { statement-1 | NEXT SENTENCE } { ELSE statement-2 | ELSE NEXT SENTENCE }

Here some restrictions that existed in earlier versions of COBOL have been
relaxed in the present ANSI standards to permit statement-1 and statement-2
to be of either imperative or conditional type. If we interpret the phrase
NEXT SENTENCE as an imperative statement having no effect, then its
axiomatization is simply

$$P\ \{\text{NEXT SENTENCE}\}\ P$$

and if $S_1$ and $S_2$ stand for permissible statements including NEXT SENTENCE,
then

$$\frac{P \land \mathit{condition}\ \{S_1\}\ Q \qquad P \land \lnot\,\mathit{condition}\ \{S_2\}\ Q}
       {P\ \{\text{IF } \mathit{condition}\ S_1 \text{ ELSE } S_2\}\ Q}$$

is the appropriate rule of inference for the IF-ELSE statement.
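
As an illustrative instance of this rule (the program fragment and assertions are chosen here and do not come from the report, and editing is assumed to leave the moved values unchanged), let $S_1$ be MOVE 0 TO y, let $S_2$ be MOVE x TO y, and take $Q \equiv (y \ge 0)$ with $P \equiv \mathit{true}$:

```latex
\frac{\mathit{true} \land x < 0\ \{\text{MOVE } 0 \text{ TO } y\}\ y \ge 0
      \qquad
      \mathit{true} \land \lnot(x < 0)\ \{\text{MOVE } x \text{ TO } y\}\ y \ge 0}
     {\mathit{true}\ \{\text{IF } x < 0 \text{ MOVE } 0 \text{ TO } y
       \text{ ELSE MOVE } x \text{ TO } y\}\ y \ge 0}
```

Both premises follow from the MOVE axiom: substituting 0 for y in $y \ge 0$ gives $0 \ge 0$, and substituting x for y gives $x \ge 0$, which is implied by $\lnot(x < 0)$.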

4. ADD

The ADD statement in COBOL, with its several variations and its
optional error exit, is perhaps the most syntactically complex arithmetic
statement to be found in any high-level language. Its axiomatization is
fairly straightforward, however, having much the same form as that of the
MOVE statement. (In fact, the semantic primitives SET$ and SETROUNDED$
represent both in Abstract COBOL.) In the most primitive form

    ADD x TO y [ROUNDED]

we have, by analogy with MOVE,

$$P^{y}_{E}\ \{\text{ADD } x \text{ TO } y\}\ P, \quad E = \mathrm{Edit}(x + y, \mathit{pic}_y)$$

where Edit should be replaced by Editround if the ROUNDED modifier is
employed. The more general form

    ADD x, y, z ... TO u, v, w ...

we currently expand into multiple internal statements, so the latter form
needs no separate axiomatization. The variant

    ADD x, y, z ... GIVING w [ROUNDED]

has a similar rule, namely

$$P^{w}_{E}\ \{\text{ADD } x, y, z \ldots \text{ GIVING } w\}\ P, \quad E = \mathrm{Edit}(x + y + z \ldots, \mathit{pic}_w).$$

Here also, the appearance of a list of variables in the place of w would
be expanded into separate internal statements. A third variant,

    ADD CORRESPONDING x TO y [ROUNDED],

leads to an axiom set very similar to that of the MOVE CORRESPONDING
statement (see 1 above). That is,

$$Z = \{(x_i, y_i) \mid x_i, y_i \text{ CORRESPONDING in } x, y\}$$

$$E_i = \mathrm{Edit}(x_i + y_i, \mathit{pic}_{y_i}),\ \ldots$$

$$\frac{\forall (x_i, y_i) \in Z:\ P^{y_i}_{E_i}\ \{\text{ADD } x_i \text{ TO } y_i\}\ P}
       {P^{y_1,\ldots}_{E_1,\ldots}\ \{\text{ADD CORRESPONDING } x \text{ TO } y\}\ P}$$


In each of the variants of the ADD statement, an optional clause [ON SIZE
ERROR imperative-statement] may be attached to take appropriate action if
numeric overflow or underflow conditions arise in the computation.
Therefore, we must consider the family of statements of which

    ADD x TO y; ON SIZE ERROR S

is typical (where S is some imperative statement). This statement requires
several axioms. Let "Sum-fits(y, x+y)" be the assertion that the result of
the computation "x+y" fits in location y. Then one correct rule of inference
is

$$\frac{P\ \{\text{ADD } x \text{ TO } y\}\ Q}
       {P \mathbin{\&} \text{Sum-fits}(y, x+y)\ \{\text{ADD } x \text{ TO } y;\ \text{ON SIZE ERROR } S\}\ Q}$$

i.e., in the absence of an error, the error condition is superfluous. Next,
suppose that Sum-fits(y, x+y) is false. Then a complete axiomatization must
distinguish two cases. The first case is that the error is detected before
y is modified. Then we may use the inference rule

$$\frac{\text{Early-detection}(y, x+y) \mathbin{\&} (P\ \{S\}\ Q) \mathbin{\&} \lnot\,\text{Sum-fits}(y, x+y)}
       {P\ \{\text{ADD } x \text{ TO } y;\ \text{ON SIZE ERROR } S\}\ Q}$$

The second case is that an error is detected after y has been modified. A
complete axiomatization must then account for the execution of S in the
modified environment. We intend, in our present work, to make the simplifying
assumption that this case does not arise.
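
The early-detection assumption can be made concrete with a small Python sketch. It is illustrative only: the receiving item is modeled as an integer field with a fixed number of digit positions, `sum_fits` is a hypothetical rendering of Sum-fits, and the imperative statement S is passed in as a callable.

```python
def sum_fits(int_digits, total):
    """Assumed meaning of Sum-fits: the result needs at most int_digits digits."""
    return abs(total) < 10 ** int_digits


def add_on_size_error(x, y, int_digits, on_size_error):
    """Model of ADD x TO y; ON SIZE ERROR S under early detection.

    Returns the new value of y. When the sum does not fit, S runs and
    y keeps its old value, because the error is detected before the store.
    """
    total = x + y
    if sum_fits(int_digits, total):
        return total
    on_size_error()
    return y


log = []
print(add_on_size_error(40, 50, 2, lambda: log.append("S ran")))  # fits
print(add_on_size_error(60, 50, 2, lambda: log.append("S ran")))  # overflow
print(log)
```

In the overflow case S executes in the original environment, so the premise P {S} Q of the second rule is enough. The late-detection case would instead require reasoning about S with y already overwritten.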

